
Lecture 3: Trust part 1

by Hongwoo 2024. 3. 3.

Pre-class reading (paper): Jacovi et al., Formalizing Trust in AI

 

Summary:

Trust is a central component of the interaction between people and AI, in that 'incorrect' levels of trust may cause misuse, abuse, or disuse of the technology.

This model rests on two key properties: the vulnerability of the user, and the ability to anticipate the impact of the AI model's decisions. We incorporate a formalization of 'contractual trust', such that trust between a user and an AI is trust that some implicit or explicit contract will hold, and a formalization of 'trustworthiness' (which detaches from the notion of trustworthiness in sociology), and with it concepts of 'warranted' and 'unwarranted' trust.

 

Risk is a prerequisite to the existence of Human-AI trust. We refer to risk as a disadvantageous or otherwise undesirable event to the trustor (that is a result of interacting with the trustee), which can possibly—but not certainly—occur.

 

Contractual trust is a model of trust in which a trustor has a belief that the trustee will stick to a specific contract.

The contract may refer to any functionality which is deemed useful, even if it is not concrete performance at the end-task that the model was trained for. Therefore, model correctness is only one instance of contractual trust.
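
A minimal sketch of this idea (the names and structure here are my own, not from the paper): since correctness is just one contract among several, trust is more naturally tracked per contract than as a single score.

# Minimal sketch (hypothetical names): beliefs are tracked per contract,
# and end-task correctness is only one of the contracts.
from dataclasses import dataclass, field

@dataclass
class ContractualTrustProfile:
    # Maps a contract description to whether the trustor believes it holds.
    beliefs: dict[str, bool] = field(default_factory=dict)

    def trusts(self, contract: str) -> bool:
        # Absent any belief, no trust in that contract.
        return self.beliefs.get(contract, False)

profile = ContractualTrustProfile(beliefs={
    "end-task accuracy": True,          # the classic 'correctness' contract
    "no toxic output": True,            # a behavioral contract
    "stable latency under load": False, # a non-performance contract
})
print(profile.trusts("no toxic output"))            # True
print(profile.trusts("stable latency under load"))  # False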

 

The formalization of contracts allows us to clarify the goal of anticipation in Human-AI trust: contracts specify the behavior to be anticipated, and to trust the AI is to believe that a set of contracts will be upheld.

 

Trust, as mentioned, aims to enable the ability to anticipate intended behavior through the belief that a contract will be upheld. Further, as mentioned in Section 2, the ability to anticipate does not necessarily manifest with the existence of trust; it is possible for a user to trust a model despite their inability to anticipate its behavior. In other words, the belief exists, but may or may not be followed by the desired behavior.

 

Given that the user trusts the AI model, anticipation depends on whether the model is able to carry out its contract. This perspective distinguishes “trust” (an attitude of the trustor) from being “trustworthy” (a property of the trustee) [64, 75], and we say that an AI model is trustworthy to some contract if it is capable of maintaining this contract.

 

We say that the trust is warranted if it is the result of trustworthiness, and otherwise it is unwarranted [50]. Warranted trust is sometimes referred to as trust which is calibrated with trustworthiness [43]. In other words, trust is the cognitive mechanism that gives the 'illusion' of anticipating intended behavior, which becomes reality when the trust is warranted, and the trustor feels “betrayed” when the illusion is broken.

 

Formally, we define warranted Human-AI trust via a causal (interventionist) relationship with trustworthiness: incurred Human-AI trust is warranted if the trustworthiness of the model can be theoretically manipulated to affect the incurred trust. Note that by this definition, it is possible for a trustworthy model to incur unwarranted Human-AI trust—in this case, the trust will not be betrayed, even though it is unwarranted.
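
One way to read this interventionist definition is as a counterfactual test: hold everything else fixed, toggle the model's actual capability to uphold the contract, and check whether the incurred trust moves. The sketch below uses hypothetical trust-formation functions (trust_from_track_record, trust_from_surface_cues are my own illustrations, not the paper's) to contrast a user whose trust tracks trustworthiness with one whose trust tracks a surface cue.

# Sketch of the interventionist test for warranted trust (the paper defines
# the test; the trust-formation functions here are assumed for illustration).

def trust_from_track_record(trustworthy: bool, fluent_ui: bool) -> bool:
    # This user's trust is caused by the model's actual capability.
    return trustworthy

def trust_from_surface_cues(trustworthy: bool, fluent_ui: bool) -> bool:
    # This user's trust is caused by presentation only.
    return fluent_ui

def is_warranted(trust_fn, fluent_ui: bool = True) -> bool:
    # Intervene on trustworthiness while holding everything else fixed:
    # trust is warranted iff the intervention changes the incurred trust.
    return trust_fn(True, fluent_ui) != trust_fn(False, fluent_ui)

print(is_warranted(trust_from_track_record))  # True  -> warranted trust
print(is_warranted(trust_from_surface_cues))  # False -> unwarranted trust

Note how the surface-cue user may well trust a model that happens to be trustworthy, yet the trust is still unwarranted, because the intervention reveals no causal link — exactly the case described above where unwarranted trust is never betrayed.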

 

When pursuing Human-AI trust, unwarranted trust should be explicitly evaluated against, and avoided or otherwise minimized.

 

If the AI is incapable of maintaining some relevant contract, it is a desired outcome (desired by the AI developer) that the user develop warranted distrust in that contract.

 

 

Formal Definitions

Trustworthy AI. An AI is trustworthy to contract C if it is capable of maintaining the contract.

 

Human-AI trust. If H (human) perceives that M (AI model) is trustworthy to contract C, and accepts vulnerability to M’s actions, then H trusts M contractually to C. The objective of H in trusting M is to anticipate that M will maintain C in the presence of uncertainty, and consequently, trust does not exist if H does not perceive risk.

 

Warranted and unwarranted Human-AI trust. H’s trust in M (to C) is warranted if it is caused by trustworthiness in M. This holds if it is theoretically possible to manipulate M’s capability to maintain C, such that H’s trust in M will change. Otherwise, H’s trust in M is unwarranted.

 

Human-AI distrust. If H (human) perceives that M (AI) is not trustworthy to contract C, and therefore does not accept vulnerability to M’s actions, then H distrusts M contractually to C. We say that it is warranted distrust if the distrust is caused by the nontrustworthiness of M.
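
Read together, the definitions compose mechanically, so they can be treated as a checklist. A sketch under that reading (all field names are my own, hypothetical encoding of the definitions above):

# Sketch encoding the formal definitions above (field names are hypothetical).
from dataclasses import dataclass

@dataclass
class HumanStance:
    perceives_trustworthy: bool  # H believes M can maintain contract C
    accepts_vulnerability: bool  # H accepts exposure to M's actions
    perceives_risk: bool         # without perceived risk, trust cannot exist

    def trusts(self) -> bool:
        # Human-AI trust per the definition above: all three must hold.
        return (self.perceives_trustworthy
                and self.accepts_vulnerability
                and self.perceives_risk)

    def distrusts(self) -> bool:
        # Human-AI distrust: perceived non-trustworthiness, and therefore
        # vulnerability is not accepted.
        return (not self.perceives_trustworthy
                and not self.accepts_vulnerability)

print(HumanStance(True, True, True).trusts())   # True
print(HumanStance(True, True, False).trusts())  # False: no perceived risk
print(HumanStance(False, False, True).distrusts())  # True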

 

6.1 Intrinsic Trust

A model is more trustworthy when the observable decision process of the model matches user priors on what this process should be.

Explanation in AI aims to explain the decision process of the AI to the user, such that they can understand why and how the decision was made. However, the process of explaining does not, in itself, enable intrinsic trust. Only when (1) the user successfully comprehends the true reasoning process of the model, and (2) the reasoning process of the model matches the user's priors of agreeable reasoning, is intrinsic trust gained.

If the user has no prior on what behavior is trustworthy for the given task, intrinsic trust will not be gained, even if the AI is easy to understand.
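
The two conditions plus the no-prior caveat reduce to a simple conjunction; a boolean sketch of that reading (my own simplification of the text above):

# Sketch of the conditions for intrinsic trust (boolean encoding is my own).

def intrinsic_trust_gained(user_comprehends_process: bool,
                           process_matches_priors: bool,
                           user_has_prior: bool) -> bool:
    # Explanation alone is not enough: the user must (1) comprehend the true
    # reasoning process and (2) find it agreeable under their priors. With no
    # prior at all, no intrinsic trust is gained even if the AI is transparent.
    if not user_has_prior:
        return False
    return user_comprehends_process and process_matches_priors

print(intrinsic_trust_gained(True, True, True))   # True: both conditions met
print(intrinsic_trust_gained(True, True, False))  # False: no prior on the task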

 

6.2 Extrinsic Trust

It is additionally possible for a model to become trustworthy not through explanation, but through behavior: in this case, the source of trust is not the decision process of the model, but the evaluation methodology or the evaluation data. For AI models, extrinsic trust is, in essence, trust in the evaluation scheme. To increase extrinsic trust is a matter of justifying that a model can generalize to unseen instances based on expected behavior of the model on other unseen instances.
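
In code terms, extrinsic trust in a correctness contract is belief in the evaluation scheme: performance on held-out data the model never saw in training is taken as evidence that it generalizes to future unseen instances. A minimal sketch (the model and data here are hypothetical stand-ins):

# Minimal sketch of extrinsic trust as trust in an evaluation scheme
# (hypothetical model and data; the paper names the idea, not this code).

def evaluate_on_held_out(model, held_out_pairs) -> float:
    # Accuracy on instances unseen during training acts as evidence for the
    # 'correctness' contract on other unseen instances.
    correct = sum(1 for x, y in held_out_pairs if model(x) == y)
    return correct / len(held_out_pairs)

parity_model = lambda x: x % 2          # stand-in "model"
held_out = [(3, 1), (8, 0), (7, 1)]     # unseen (input, label) pairs
score = evaluate_on_held_out(parity_model, held_out)
print(f"held-out accuracy: {score:.2f}")  # evidence, not proof, of the contract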

 

 

___

What is Trust?

Trust: if A believes that B will act in A's best interest, and accepts vulnerability to B's actions, then A trusts B.

- Attitude, belief, intention or reliance

- Based on imperfect information or uncertainty

- About another agent or about other entities in general

- In a situation where some risk is involved

 

When do we need trust?

We need trust to:

- mitigate uncertainty and risk, by enabling the anticipation that others will act in our best interest

 

Trust vs Trustworthiness

Trust is directional from one agent to another

e.g. X trusts Y → Attitude of X

 

Trustworthiness is inherent to an agent

e.g. X is trustworthy → Property of X

 

Someone is trustworthy w.r.t. some contract if they will maintain this contract

 

Appropriate Trust

Appropriate trust → trust that is calibrated with the trustee's actual trustworthiness

 

 

 


 

Risk

Trust is relevant only when some risk is involved; risk is a prerequisite to the existence of trust

 

