본문 바로가기
학교/CAI

Lecture 10: Automated Negotiation

by Hongwoo 2024. 3. 27.
반응형

목차

     

    Automated Negotiation

    Intelligent software agents negotiate on behalf of their users.

     

     

    How can a software agent negotiate its user's behalf?

    Evaluating bids: reasoning on its user's preferences

     

    Employing a negotiation strategy

    → Which action the agent will take

    → How the agent will generate its offer

    → When the agent will accept the opponent's counter offer

     

    Communicating with other agent(s) based on predefined rules (negotiation protocol)

     

     

    Normalization

    You need first to normalize all evaluation functions (valuation functions)

    For discrete domains:

    - Pick the item of the issue's range that has maximum value, the normalized evaluation function for that issue maps that item to 1.

    - For all other items in the range normalize in proportion to the maximum value to determine it's normalized value

     

    즉, 표준화시키려면 maximum value 구해서 이 값을 1로 바꾼 다음 나머지 값들을 비례해서 [0. 1] 사이의 값이 되게 하면 된다.

     

     

    Negotiation Protocols

    A negotiation protocol governs the interaction between negotiating parties by detmining:

    → how the parties interact/exchange information

    → when the negotiation ends

     

    Bilateral Negotiation → Alternating Offers Protocol (제안들을 주고받는 프로토콜)

     

    Multiparty Negotiation → Mediated Single Text Protocol

     

     

    Alternating Offers Protocol

    One of the agents initiates negotiation with an offer.

    The agent receiving an offer can accept the offer, make a counter-offerend negotiation.

    This process continues in a turn-taking fashion until having a consensus or reaching a termination condition.

    한 줄 요약: 한 에이전트가 협상 시작. 제안들을 주고받고 제안을 받은 에이전트는 수락, 대안 (수정 제안)을 제시, 또는 협상을 결렬할 수 있다. 이 과정은 합의가 되거나 아니면 종료 조건이 맞을 때까지 지속.

     

     

    Mediated Single Text Negotiation Protocol

    Mediator generates an offer and asks negotiating agents for their votes either to accept or to reject.

     

     

    Negotiating agents send their votes for the current bid according to their acceptance strategy.

    If all negotiating agents vote "accept", the bid is labeled as the most recently accepted.

     

     

     

    Mediator modifies the most recently accepted bid by exchanging one value arbitrary and asks negotiating agents' votes again.

    It updates the most recently accepted bid if all negotiating agents vote as "accept".

    This process continues iteratively until reaching a predefied number of bids.

    Problem: it's not that random (maybe biased, not random enough --> local minimum --> need to bring in more randomness)

     

     

    The role of the mediator is to propose new ideas that are acceptable to all.

    - Mediated Hill-Climber Agent

    - Mediated Annealer Agent

     

     

    Mediated Hill Climber Agent

    Accept a bid if its utility is higher than the utility of the most recently accepted bid.

    If the utility of initial bid is quite high for one of the agents, that agent may not accept other bids even though those bids might be better for the majority.

     

     

    Mediated Annealer Agent

    Calculates the probability of acceptance for the current bid:

    T: Virtually temperature gradually declines over time

     

    Higher probability for acceptance:

    - The utility difference is small & virtual temperature is high

    Tendency to accept individually worse bids earlier so the agents find win-win bids later

     

     

     

    Negotiation Strategies

    Negotiation strategy determines:

    - which action the agent will take

    - how the agent will generate its offers

    - how the agent decide whether the opponent's counter-offer is acceptable

     

     

     

    Bidding Strategies 

    Random Walker

    It generates an offer randomly as follows:

    - Selects values of issues randomly

    - Proposes only those bids whose own utility greater than its reservation utility (RU = 0.6)

     

     

    Time-Dependent Concession Strategy

    Concession: 양보

    Each agent has a deadline and the agent's behavior changes with respect to the time.

    An offer which is not acceptable at the beginning, may become acceptable over time (conceding while approaching the deadline)

    A function determines how much the agent will concede:

    - Remainng negotiation time

    - Parameter related to concession speed (β)

     

    Conceder Tactic:  β > 1 concedes fast and goes to its reservation value quickly

    Boulware Tactic: β < 1 hardly concedes until the deadline

     

     

    Trade-Off Strategy

    Not only considers it own utility but also take its opponent's utility into account.

    The importance of the isseus may be different for the negotiating agents.

    - E.g. delivery time might be more important for the consumer

    The agent may demand more on some issues while concedes on other issues without changing its overall utility as if possible.

    - E.g. higher price in order to have an earlier delivery

     

    Selects a subset of bids having the same utility with its previous offer (iso-curve)

    → if not possible, it makes minimal concession such as 0.05.

    Among those bids, choose the bids which might be more preferred by its opponent

    → Heuristic: the most similar one to opponent's last bid

     

     

    Behavior Dependent Strategies

    The agent imitates its opponent's behavior.

    The degree of imitation may vary:

    Tit-For-Tat : 보복, 앙갚음

    Absolute Tit-For-Tat: the opponent increases the price by 50 units then the agent will decrease the price by 50 units

    Relative (proportional) Tit-For-Tat: Taking into account the changes of its opponent's behavior in a number of previous steps

    Averaged Tit-For-Tat: Taking into account the average changes within a window of size of its opponent history

     

     

     

    Opponent Modelling Strategies

    Opponent Modelling, Why?

    Exploit the opponent

    Maximize chance of reaching an agreement

    - Requiring outcome with acceptable utility for opponent, i.e. resolving the conflict of interest

    Increase the efficiency of a negotiated agreement

    - Searching through the outcome space for outcomes that are mutually beneficial (즉, 양쪽에 이득이 되는 상황)

    - Reaching better / optimal agreements

    Avoid unfortunate moves, which is worse for both agents

    Make trade-offs and maximize social welfare

    Reach agreements early, reducing communication cost

     

     

    Opponent Modelling, What?

    Learning which issues are important for the opponent  →  issue weights

    Learning opponent's preferences

    → Evaluation of issue values

    → Preference ordering of issue values

    Learning about opponent's strategy

    → Predicting the utility of its next offer

    Learning what kind of offers are not acceptable

    → Reservation value

    → Constraints

     

     

    Acceptance Conditions

    Introduction

    In every negotiation with a deadline, one of the negotiating parties has to accept an offer to avoid a break off.

    A break off is usually an undesirable outcome (협상 결렬)

    3 decisions - close the negotiation, make a counter offer, walk away

    Utility below the reservation value -> walk away

     

    When designing such conditions one is faced with the acceptance dilemma:

    - Accepting too early may result in suboptimal agreements

    - Accepting too late may result in a break off

    We have to find a balance:

     

     

     

    Selecting of Existing Acceptance Conditions

    AC_const (α): Accept when the opponent's bid is better than α 

    AC_next: Accept when the opponent's bid is better than our upcoming bid

    AC_time (T): Accept when time T ∈ [0, 1] has passed

     

     

    Combining Acceptance Conditions

    We can also combine acceptance conditions, e.g.

     

    We can also choose non-constant values for α, such as average utility so far received (AVG) or maximim utility (MAX).

     

    여기에서 알 수 있는 점들:

    1. ACconst(0.9): 상대 제안이 utility 0.9보다 높으면 수락. 따라서, 협의할 가능성이 낮은 대신에 합의를 하면 utility 값은 높다.

    2. ACnext: 상대의 제안이 우리가 생각한 다음 제안보다 높으면 수락. 따라서, 어느 정도 합의할 확률도 높고 합의를 할 때 utility값도 꽤 높은 편. 단, ACconst(0.9)보다는 낮다.

    3. ACtime(0.99): 거의 deadline이 다가왔을 때 수락. 따라서, 합의할 확률은 되게 높은 대신에 합의했을 때의 utility는 중간 정도.

    4. Original

    5. ACcombi(AVG, 0.99): 

     

     

    Conclusion

    AC_next is often used, but does not always give the best results.

    AC_const(α) performs worst of all AC's, as a good value for α is highly domain-dependent.

    AC_time (T) always reaches an agreement, but of relatively low utility.

     

     

    Challenges in Automated Negotiation

    Designing negotiation protocols & strategy

    Representing and Reasoning on Preferences in Negotiation

    Predicting other Agent's preferences during negotiation

    Acceptance strategies (can you infer which strategies the other party is using?  - 알면 manipulation 가능)

     

     

    Analysis of Negotiation Dynamics

    Analysis of negotiation strategies

    What kind of bids to make

    Process analysis:

    → Step analysis

    → Dynamic properties

    What kinds of bids to accept

    Outcome analysis:

    → Nash product

    → Kalai-Smorodinsky

    → Pareto-optimal

     

     

    Outcome Analysis

     

     

    Negotiation Traces

     

     

    Utility, Negotiation Steps, and Traces

     

     

     

    Concession Step

    Concession step: 내 utility는 줄어들고, 다른 파티의 utility는 같거나 증가.

     

     

    Unfortunate Step

     

    Unfortunate step: 내 utility 줄어들거나 같고 상대방의 utility는 감소.

     

     

    Fortunate Step

    내 utility와 상대방 utility 둘 다 증가.

     

     

    Selfish Step

    Selfish Step: 내 utility는 증가. 상대방은 같거나 감소

     

     

    Silent step

    똑같이 유지인 듯.

     

    Nice step

    본인한테는 그대로, 남에게는 더 좋게.

     

     

    Summary

    Concession Step: (S -, O ≥)

    Unfortunate Step: (S ≤, O -)

    Fortunate: (S+ , O+)

    Selfish Step: (S+, O ≤)

    Silent Step: (S=, O=)

    Nice Step: (S=, O+)

     

    Classification of Negotiation Steps

     

     

    The Three Strategies

    ABMP:

    - Does not use any knowledge about opponent

    - Calculates concession step on every round of negotiation

    - Always make concession on every issue

     

    Trade-off:

    - Uses domain knowledge

    - Tries to find bids on the same iso-level of own utility function that is closer to the current opponent's bid, makes concession of 0.05 if stuck

    - Uses opponent's bid to make trade-offs

     

    Random-Walker:

    - Selects values of issues randomly

    - Proposes only those bids that have own utility > 0.6

     

     

    The Three Domains

    Second hand car selling domain:

    - 5 issues (4 discrete issues and pricce issue)

    - Only the buyer's preferenes and the price issue are predictable

     

    Service-oriented negotiation (SON):

    - 4 continuous issues

    - All issues are predictable

     

    AMPO vs City

    - 10 issues

    - only 8 issues are predictable

     

     

     

     

     

     

     

     

     

    Outcome Utility

    Overall utility:

    - ABMP: 0.72

    - Trade-Off: 0.74

    - Random Walker: 0.69

     

    Trade-Off:

    - Outperforms ABMP on the SON domain with complete information and on the AMPO vs City domain

    - Underperforms wrt ABMP on the secodn hand car due to wrong weights and unpredictable issues → (need to test it on different domains, instead of only on one domain)

     

    ABMP:

    - Strong on the second hand car domain

    - Underperforms on the SON domain

     

     

    Conclusions

    To negotiate efficiently → know your partner

    It is impossible to avoid unfortunate steps without sufficient domain knowledge or opponent knowledge.

    In the analysis of negotiation strategies, not only the outcome of a negotiation is relevant, but also the bidding process itself is important.

    When developing a general negotiation strategy test against many opponents and in many domains.

     

     

    The BOA Framework

    Many agent strategies are comprised of a fixed set of modules; generally, a distinction is made between three different modules:

    B: Bidding Strategy: One that decideds which set of bids could be proposed next

    O: Opponent Model: One that tries to guess the opponent's preferences and takes this into account when selecting an offer to send out

    A: Acceptance Strategy: One module that decides whether the opponent's bid is acceptable

     

     

     

     

     

    반응형

    '학교 > CAI' 카테고리의 다른 글

    Lecture 11: Ethics of CAI: Values and Norms  (0) 2024.04.08
    Lecture 9: Negotiation Formalization  (1) 2024.03.27
    Lecture 8: Negotiation  (1) 2024.03.26
    Lecture 7: Computational Coalition Formation  (5) 2024.03.06
    Lecture 6: Social Choice Part 2  (0) 2024.03.03

    댓글