
Publications



2013


Articles in Refereed Journals

Repeated games for multiagent systems: a survey.
Andriy Burkov and Brahim Chaib-draa, In The Knowledge Engineering Review, 1--30, 2013, (pdf) (bib).

Articles in Refereed Proceedings

A KNN Based Kalman Filter Gaussian Process Regression.
Yali Wang and Brahim Chaib-draa, In International Joint Conference on Artificial Intelligence (IJCAI), 2013 (bib).


2012


Articles in Refereed Journals

Stochastic Resource Allocation in Multiagent Environments: an Approach based on Distributed Q-Values and Bounded Real-Time Dynamic Programming.
Pierrick Plamondon and Brahim Chaib-draa, In Int. Journal of Artificial Intelligence Tools, 2012, (pdf) (bib).

Apprenticeship learning with few examples.
Abdeslam Boularias and Brahim Chaib-draa, In Neurocomputing, 2012, (pdf) (bib).

Building adaptive dialogue systems via Bayes-adaptive POMDPs.
Shaowei Png, Joelle Pineau and Brahim Chaib-draa, In IEEE Journal of Selected Topics in Signal Processing, 917--927, 2012, (pdf) (bib).

Articles in Refereed Proceedings

An Adaptive Nonparametric Particle Filter for State Estimation.
Yali Wang and Brahim Chaib-draa, In IEEE International Conference on Robotics and Automation (ICRA), 2012, (pdf) (bib).

Learning Observation Models for Dialogue POMDPs.
Hamid R. Chinaei, Brahim Chaib-draa and Luc Lamontagne, In 25th Canadian Conference on Artificial Intelligence (AI'2012), 2012, (pdf) (bib).

A Marginalized Particle Gaussian Process Regression.
Yali Wang and Brahim Chaib-draa, In Neural Information Processing Systems (NIPS'2012), 2012, (pdf) (bib).

An Inverse Reinforcement Learning Algorithm for Partially Observable Domains with Application on Healthcare Dialogue Management.
Hamid R. Chinaei and Brahim Chaib-draa, In 11th International Conference on Machine Learning and Applications (ICMLA'2012), 2012, (pdf) (bib).


2011


Articles in Refereed Journals

An approximate inference with Gaussian process to latent functions from uncertain data.
Patrick Dallaire, Camille Besse and Brahim Chaib-draa, In Neurocomputing, 2011, (pdf) (bib).

A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes.
Stéphane Ross, Joelle Pineau, Brahim Chaib-draa and Pierre Kreitmann, In Journal of Machine Learning Research, 1655--1696, 2011, (pdf) (bib).

Cooperative Adaptive Cruise Control: A Reinforcement Learning Approach.
Charles Desjardins and Brahim Chaib-draa, In IEEE Transactions on Intelligent Transportation Systems, 2011, (pdf) (bib).

Articles in Refereed Proceedings

Toward Error-bounded Algorithms for Infinite-Horizon DEC-POMDPs.
Jilles S. Dibangoye, Abdel-Illah Mouaddib and Brahim Chaib-draa, In Proceedings of 10th International Conference on AAMAS, 2011, (pdf) (bib).

Learning Dialogue POMDP Models from Data.
Hamid R. Chinaei and Brahim Chaib-draa, In 24th Canadian Conference on Artificial Intelligence (AI'2011), 2011, (pdf) (bib).

Tactile Perception for Surface Identification Using a Triple Axis Accelerometer Probe.
Patrick Dallaire, Daniel Edmond, Philippe Giguère and Brahim Chaib-draa, In International Symposium on Robotics and Sensors Environments (ROSE), Montréal, Canada, 2011, (pdf) (bib).


2010


Articles in Refereed Journals

Task allocation learning in a multiagent environment: Application to the RoboCupRescue.
Sébastien Paquet, Brahim Chaib-draa, Patrick Dallaire and Danny Bergeron, In Multiagent and Grid Systems, 2010, (pdf) (bib).

Book Chapters

Stochastic Games.
Andriy Burkov and Brahim Chaib-draa, In Markov Decision Processes and Artificial Intelligence, Wiley - ISTE, 2010 (bib).

Articles in Refereed Proceedings

Apprenticeship Learning via Soft Local Homomorphisms.
Abdeslam Boularias and Brahim Chaib-draa, In Proceedings of 2010 IEEE International Conference on Robotics and Automation (ICRA'10), Anchorage, USA, 2010, (pdf) (bib).

Solving the Continuous Time Multiagent Patrol Problem.
Jean-Samuel Marier, Camille Besse and Brahim Chaib-draa, In Proceedings of 2010 IEEE International Conference on Robotics and Automation (ICRA'10), 2010, (pdf) (bib).

Quasi-Deterministic POMDPs and Dec-POMDPs.
Camille Besse and Brahim Chaib-draa, In Proceedings of 5th International Workshop On Multiagent Sequential Decision Making in Uncertain Domains, 2010, (pdf). A shorter version also appeared in Proceedings of 9th International Conference On Autonomous Agents and MultiAgent Systems (Extended Abstract), 2010, (pdf) (bib).
Abstract: In this paper we study a particular subclass of partially observable models called quasi-deterministic partially observable Markov decision processes (QDetPOMDPs), characterized by deterministic transitions and stochastic observations. While this framework does not model the same general problems as POMDPs, it still captures a number of interesting and challenging problems and has, in some cases, interesting properties. By studying the observability available in this subclass, we suggest that QDetPOMDPs may fall many steps in the complexity hierarchy. An extension of this framework to the decentralized case also reveals a subclass of numerous problems that can be approximated in polynomial space. Finally, a sketch of ε-optimal algorithms for these classes of problems is given and empirically evaluated.

An Approximate Subgame-Perfect Equilibrium Computation Technique for Repeated Games.
Andriy Burkov and Brahim Chaib-draa, In Proceedings of Twenty-Fourth Conference on Artificial Intelligence (AAAI'10), 2010, (pdf) (bib).
Abstract: This paper presents a technique for approximating, up to any precision, the set of subgame-perfect equilibria (SPE) in repeated games with discounting. The process starts with a single hypercube approximation of the set of SPE payoff profiles. Then the initial hypercube is gradually partitioned into a set of smaller adjacent hypercubes, while those hypercubes that cannot contain any SPE point are gradually withdrawn. Whether a given hypercube can contain an equilibrium point is verified by an appropriate mixed integer program. Special attention is paid to the question of extracting players' strategies and their representability in the form of finite automata.

Bootstrapping Apprenticeship Learning.
Abdeslam Boularias and Brahim Chaib-draa, In Advances in Neural Information Processing Systems 24 (NIPS'10), 2010, (pdf) (bib).

Learning the Reward Model of Dialogue POMDPs from Data.
Abdeslam Boularias, Hamid R. Chinaei and Brahim Chaib-draa, In NIPS 2010 workshop of Machine Learning for Assistive Techniques, 2010, (pdf) (bib).


2009


Articles in Refereed Journals

Effective Learning in the Presence of Adaptive Counterparts.
Andriy Burkov and Brahim Chaib-draa, In Journal of Algorithms, 127--138, 2009, (pdf) (bib).
Abstract: Adaptive learning algorithms (ALAs) are an important class of agents that learn the utilities of their strategies while maintaining beliefs about their counterparts' future actions. In this paper, we propose an approach to learning in the presence of adaptive counterparts. Our Q-learning based algorithm, called Adaptive Dynamics Learner (ADL), assigns Q-values to fixed-length interaction histories. This makes it capable of exploiting the strategy update dynamics of the adaptive learners. By so doing, ADL usually obtains higher utilities than those of equilibrium solutions. We tested our algorithm on a substantial representative set of the best-known and most demonstrative matrix games. We observed that ADL is highly effective in the presence of such ALAs as Adaptive Play Q-learning, Infinitesimal Gradient Ascent, Policy Hill-Climbing and Fictitious Play Q-learning. Further, in self-play ADL usually converges to a Pareto efficient average utility.

Book Chapters

Learning Agents for Collaborative Driving.
Charles Desjardins, Julien Laumonier and Brahim Chaib-draa, In Multi-Agent Systems for Traffic and Transportation Engineering, IGI Global, 240--260, 2009, (pdf) (bib).
Abstract: This chapter studies the use of agent technology in the domain of vehicle control. More specifically, it illustrates how agents can address the problem of collaborative driving. First, the authors briefly survey the related work in the field of intelligent vehicle control and inter-vehicle cooperation that is part of Intelligent Transportation Systems (ITS) research. Next, they detail how these technologies are especially adapted to the integration for decision making of autonomous agents. In particular, they describe an agent-based cooperative architecture that aims at controlling and coordinating vehicles. In this context, the authors show how reinforcement learning can be used for the design of collaborative driving agents and they explain why this learning approach is well-suited for the resolution of this problem.

Articles in Refereed Proceedings

Policy Iteration Algorithms for DEC-POMDPs with Discounted Rewards.
Jilles S. Dibangoye, Brahim Chaib-draa and Abdel-Illah Mouaddib, In Proceedings of the Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains (MSDM'09), Budapest, Hungary, 2009, (pdf) (bib).
Abstract: Over the past seven years, researchers have been trying to find algorithms for the decentralized control of multiple agents under uncertainty. Unfortunately, most of the standard methods are unable to scale to real-world-size domains. In this paper, we come up with promising new theoretical insights to build scalable algorithms with provable error bounds. In the light of the new theoretical insights, this research revisits the policy iteration algorithm for the decentralized partially observable Markov decision process (DEC-POMDP). We derive and analyze the first point-based policy iteration algorithms with provable error bounds. Our experimental results show that we are able to successfully solve all tested DEC-POMDP benchmarks, outperforming standard algorithms in both solution time and policy quality.

Point-based Incremental Pruning Heuristic for Solving Finite-Horizon DEC-POMDPs.
Jilles S. Dibangoye, Abdel-Illah Mouaddib and Brahim Chaib-draa, In Proceedings of The 8th International AAMAS Conference, 2009, (pdf) (bib).
Abstract: Recent scaling up of decentralized partially observable Markov decision process (DEC-POMDP) solvers towards realistic applications is mainly due to approximate methods. Of this family, MEMORY BOUNDED DYNAMIC PROGRAMMING (MBDP), which combines in a suitable manner top-down heuristics and bottom-up value function updates, can solve DEC-POMDPs with large horizons. The performance of MBDP can, however, be drastically improved by avoiding the systematic generation and evaluation of all possible policies which result from the exhaustive backup. To achieve that, we suggest a heuristic search method, namely POINT BASED INCREMENTAL PRUNING (PBIP), which is able to distinguish policies with different heuristic estimates. Taking this insight into account, PBIP searches only among the most promising policies, finds those useful, and prunes dominated ones. Doing so clearly reduces the amount of computation required by the exhaustive backup. The computational experiments show that PBIP solves DEC-POMDP benchmarks up to 800 times faster than the current best approximate algorithms, while providing solutions with higher values.

Learning User Intentions in Spoken Dialogue Systems.
Hamid R. Chinaei and B. Chaib-draa, In Proceedings of 1st International Conference on Agents and Artificial Intelligence (ICAART'09), Best Student Paper Award, 2009, (pdf) (bib).
Abstract: A common problem in spoken dialogue systems is finding the intention of the user. This problem deals with obtaining one or several topics for each transcribed, possibly noisy, sentence of the user. In this work, we apply a recent unsupervised learning method, Hidden Topic Markov Models (HTMM), for finding the intention of the user in dialogues. This technique combines two methods, Latent Dirichlet Allocation (LDA) and Hidden Markov Models (HMMs), in order to learn topics of documents. We show that HTMM can also be used for obtaining intentions for the noisy transcribed sentences of the user in spoken dialogue systems. We argue that in this way we can learn possible states in a speech domain, which can be used in the design stage of its spoken dialogue system. Furthermore, we discuss how the learned model can be augmented and used in a POMDP (Partially Observable Markov Decision Process) dialogue manager of the spoken dialogue system.

Topological Order Planner for POMDPs.
Jilles S. Dibangoye, Guy Shani, Brahim Chaib-draa and Abdel-Illah Mouaddib, In International Joint Conference of Artificial Intelligence (IJCAI), 2009, (pdf) (bib).
Abstract: Over the past few years, point-based POMDP solvers scaled up to produce approximate solutions to mid-sized domains. However, to solve real world problems, solvers must exploit the structure of the domain. In this paper we focus on the topological structure of the problem, where the state space contains layers of states. We present here the Topological Order Planner (TOP) that utilizes the topological structure of the domain to compute belief space trajectories. TOP rapidly produces trajectories focused on the solvable regions of the belief space, thus reducing the number of redundant backups considerably. We demonstrate that TOP produces good quality policies faster than any other point-based algorithm on domains with sufficient structure.

Predictive Representations for Policy Gradient in POMDPs.
Abdeslam Boularias and Brahim Chaib-draa, In Proceedings of the Twenty-sixth International Conference on Machine Learning (ICML'09), 2009, (pdf) (bib).
Abstract: We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive State Representations (PSRs). We compare PSR policies to Finite-State Controllers (FSCs), which are considered as a standard model for policy gradient methods in POMDPs. We present a general Actor-Critic algorithm for learning both FSCs and PSR policies. The critic part computes a value function that has as variables the parameters of the policy. These latter parameters are gradually updated to maximize the value function. We show that the value function is polynomial for both FSCs and PSR policies, with a potentially smaller degree in the case of PSR policies. Therefore, the value function of a PSR policy can have fewer local optima than the equivalent FSC, and consequently, the gradient algorithm is more likely to converge to a globally optimal solution.

Multiagent Learning and Optimality Criteria in Repeated Game Self-play.
Andriy Burkov and Brahim Chaib-draa, In Actes des Cinquièmes Journées Francophones Modèles formels de l’interaction, 93--100, 2009, (pdf) (bib).
Abstract: We present a multiagent learning approach to satisfy any given optimality criterion in repeated game self-play. Our approach is opposed to classical learning approaches for repeated games: namely, learning of equilibrium, Pareto efficient learning, and their variants. The comparison is given from a practical (or engineering) standpoint, i.e., from a point of view of a multiagent system designer whose goal is to maximize the system's overall performance according to a given optimality criterion. Extensive experiments in a wide variety of repeated games demonstrate the efficiency of our approach.

Bayesian Reinforcement Learning in POMDPs with Gaussian Processes.
P. Dallaire, C. Besse, S. Ross and B. Chaib-draa, In Proceedings of the International Conference on Intelligent Robots and Systems (IROS), 2009, (pdf) (bib).
Abstract: Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical model to handle real-world sequential decision processes but require a known model to be solved by most approaches. However, mainstream POMDP research focuses on the discrete case and this complicates its application to most realistic problems that are naturally modeled using continuous state spaces. In this paper, we consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are unknown. We advocate the use of Gaussian Process Dynamical Models (GPDMs) so that we can learn the model through experience with the environment. Our results on the blimp problem show that the approach can learn good models of the sensors and actuators in order to maximize long-term rewards.

Anytime Self-play Learning to Satisfy Functional Optimality Criteria.
Andriy Burkov and Brahim Chaib-draa, In Proceedings of 1st International Conference On Algorithmic Decision Theory (ADT'09), 446--457, 2009, (pdf) (bib).
Abstract: We present an anytime multiagent learning approach to satisfy any given optimality criterion in repeated game self-play. Our approach is opposed to classical learning approaches for repeated games: namely, learning of equilibrium, Pareto-efficient learning, and their variants. The comparison is given from a practical (or engineering) standpoint, i.e., from a point of view of a multiagent system designer whose goal is to maximize the system's overall performance according to a given optimality criterion. Extensive experiments in a wide variety of repeated games demonstrate the efficacy of our approach.

Quasi-Deterministic Partially Observable Markov Decision Processes.
Camille Besse and Brahim Chaib-draa, In Proceedings of 16th International Conference On Neural Information Processing, 237--246, 2009, (pdf) (bib).
Abstract: We study a subclass of POMDPs, called quasi-deterministic POMDPs (QDetPOMDPs), characterized by deterministic actions and stochastic observations. While this framework does not model the same general problems as POMDPs, it still captures a number of interesting and challenging problems and, in some cases, has interesting properties. By studying the observability available in this subclass, we show that QDetPOMDPs may fall many steps in the complexity classes of the polynomial hierarchy.

A Markov Model for Multiagent Patrolling in Continuous Time.
Jean-Samuel Marier, Camille Besse and Brahim Chaib-draa, In Proceedings of 16th International Conference On Neural Information Processing, 648--656, 2009, (pdf) (bib).
Abstract: We present a model for the multiagent patrolling problem in continuous time. An anytime and online algorithm is then described and extended to asynchronous multiagent decision processes. An online algorithm is also proposed for coordinating the agents. Finally, we compare our approach empirically to existing methods.

Learning Gaussian Process Models from Uncertain Data.
Patrick Dallaire, Camille Besse and Brahim Chaib-draa, In Proceedings of 16th International Conference On Neural Information Processing, 433--440, 2009, (pdf) (bib).
Abstract: It is generally assumed in the traditional formulation of supervised learning that only the output data are uncertain. However, this assumption might be too strong for some learning tasks. This paper investigates the use of a Gaussian Process prior to infer consistent models given uncertain data. By assuming a Gaussian distribution with known variances over the inputs and a Gaussian covariance function, it is possible to marginalize out the inputs' uncertainty and keep an analytical posterior distribution over functions. We demonstrate the properties of the method on a synthetic problem and on a more realistic one, which consists in learning the dynamics of the well-known cart-pole problem, and compare the performance against a classic Gaussian Process. A large improvement of the mean squared error is presented, as well as the consistency of the result of the regression.


2008


Articles in Refereed Journals

Online Planning Algorithms for POMDPs.
Stéphane Ross, Joelle Pineau, Brahim Chaib-draa and Sébastien Paquet, In Journal of AI Research (JAIR), 663--704, 2008, (pdf) (bib).
Abstract: Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP is often intractable except for small problems due to their complexity. Here, we focus on online approaches that alleviate the computational complexity by computing good local policies at each decision step during the execution. Online algorithms generally consist of a lookahead search to find the best action to execute at each time step in an environment. Our objectives here are to survey the various existing online POMDP methods, analyze their properties and discuss their advantages and disadvantages; and to thoroughly evaluate these online approaches in different environments under various metrics (return, error bound reduction, lower bound improvement). Our experimental results indicate that state-of-the-art online heuristic search methods can handle large POMDP domains efficiently.

Spreadsheet vs Multiagent Based Simulations: The Case of Supply Chains.
Thierry Moyaux, Brahim Chaib-draa and Sophie D'Amours, In International Journal of Simulation and Process Modeling (IJPSM), 2008, (pdf) (bib).
Abstract: The Quebec Wood Supply Game (QWSG) is a role-playing simulation based on the Beer Game for teaching supply chain dynamics and, in particular, the bullwhip effect. In this context, this paper describes and compares two simulators based on the QWSG which may be used to study decision making and its impact on supply chain dynamics. We first focus on the direct implementation of the QWSG in a spreadsheet program. This spreadsheet model is the base on which we next build a more complex MultiAgent Based Simulation (MABS) in which JACK™ agents represent companies. Finally, we compare the respective advantages of each simulator. We identify the features of a supply chain model making a spreadsheet simulation impossible, and those for which a spreadsheet simulation is better, as good as, or worse than MABS.

Book Chapters

Une introduction aux jeux stochastiques.
Andriy Burkov and Brahim Chaib-draa, In Processus décisionnels de Markov en intelligence artificielle, Hermès Science - Lavoisier, 135--178, 2008, (pdf) (bib).

Articles in Refereed Proceedings

Bayesian Reinforcement Learning in Continuous POMDPs with Application to Robot Navigation.
Stéphane Ross, Joelle Pineau and Brahim Chaib-draa, In Proceedings of the 2008 IEEE International Conference on Robotics and Automation (ICRA'08), 2008, (pdf) (bib).
Abstract: We consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are not known exactly. Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical model to handle such environments but require a known model to be solved by most approaches. This is a limitation in practice as the exact model parameters are often difficult to specify. We adopt a Bayesian approach where a posterior distribution over the model parameters is maintained and updated through experience with the environment. We propose a particle filter algorithm to maintain the posterior distribution and an online planning algorithm, based on trajectory sampling, to plan the best action to perform under the current posterior. The resulting approach selects control actions which optimally trade off between 1) exploring the environment to learn the model, 2) identifying the system's state, and 3) exploiting its knowledge in order to maximize long-term rewards. Our preliminary results on a robot navigation problem show that our approach is able to learn good models of the sensors and actuators and performs as well as if it had the true model.

Approximation de politiques par renforcement et classification.
Julien Laumonier and Brahim Chaib-draa, In 16e congrès francophone AFRIF-AFIA, Reconnaissance des Formes et Intelligence Artificielle, Best Paper Award, 2008, (pdf). Best AI contribution prize (bib).
Abstract: Most of the time, exact reinforcement learning cannot efficiently solve very large real-world applications. Research has therefore turned to approximation methods. Some use an approximate value function, while others focus on policy gradient approaches. In parallel, many approximation concepts have been studied by the supervised learning community. It became clear that the two forms of learning needed to be linked, and approaches that connect reinforcement learning and supervised learning, by approximating the optimal policy through classification, have recently appeared. In this article, we propose an algorithm that combines Q-Learning, support vector machines (SVMs) and majority voting to approximate the optimal policy of a Markov Decision Process. We study the impact of the parameters resulting from the combination of these approaches on learning performance, and we show empirically that our algorithm is able to learn a near-optimal policy even with a large number of states to handle.

Parallel Rollout for Online Solution of Dec-POMDP.
Camille Besse and Brahim Chaib-draa, In Proceedings of 21st International FLAIRS Conference, 619--624, 2008, (pdf) (bib).
Abstract: A major research challenge is presented by the scalability of algorithms for solving decentralized POMDPs because of their double exponential worst-case complexity for finite horizon problems. The first algorithms have only been able to solve very small instances on very small horizons. One exception is the Memory-Bounded Dynamic Programming algorithm, an approximation technique that has proved efficient in handling same-sized problems but on large horizons. In this paper, we propose an online algorithm that also approximates larger instances of finite horizon DEC-POMDPs based on the Rollout algorithm. To evaluate the effectiveness of this approach, we compare the presented approach to a recently proposed algorithm called memory bounded dynamic programming. Experimental results show that despite the very high complexity of DEC-POMDPs, the combination of Rollout techniques and estimation techniques performs well and leads to a significant improvement of existing approximation techniques.

State Space Compression with Predictive Representations.
A. Boularias, M. Izadi and B. Chaib-draa, In Proceedings of the Twenty-First International Florida Artificial Intelligence Research Society Conference (FLAIRS'08), 41--46, 2008, (pdf) (bib).
Abstract: Current studies have demonstrated that the representational power of predictive state representations (PSRs) is at least equal to that of partially observable Markov decision processes (POMDPs). Meanwhile, early steps in planning and generalization with PSRs suggest substantial improvements compared to POMDPs. However, the lack of practical algorithms for learning these representations severely restricts their applicability. The computational inefficiency of exact PSR learning methods naturally leads to the exploration of various approximation methods that can provide a good set of core tests with less computational effort. In this paper, we address this problem in an optimization framework. In particular, our approach aims to minimize the potential error that may be caused by missing a number of core tests. We provide an analysis of the error caused by this compression and present an empirical evaluation illustrating the performance of this approach.

Exact Dynamic Programming for Decentralized POMDPs with Lossless Policy Compression.
A. Boularias and B. Chaib-draa, In Proceedings of the Eighteenth International Conference on Automated Planning and Scheduling (ICAPS'08), 2008, (pdf). To appear (bib).
Abstract: High dimensionality of belief space in DEC-POMDPs is one of the major causes that makes the optimal joint policy computation intractable. The belief state for a given agent is a probability distribution over the system states and the policies of other agents. Belief compression is an efficient POMDP approach that speeds up planning algorithms by projecting the belief state space to a low-dimensional one. In this paper, we introduce a new method for solving DEC-POMDP problems, based on the compression of the policy belief space. The reduced policy space contains sequences of actions and observations that are linearly independent. We tested our approach on two benchmark problems, and the preliminary results confirm that the Dynamic Programming algorithm scales up better when the policy belief is compressed.

Planning in Decentralized POMDPs with Predictive Policy Representations.
A. Boularias and B. Chaib-draa, In Proceedings of ICAPS'08 Multiagent Planning Workshop (MASPLAN'08), 2008, (pdf). To appear (bib).
Abstract: We discuss the problem of policy representation in stochastic and partially observable systems, and address the case where the policy is a hidden parameter of the planning problem. We propose an adaptation of the Predictive State Representations (PSRs) to this problem by introducing tests (sequences of actions and observations) on policies. The new model, called the Predictive Policy Representations (PPRs), is more compact and uses fewer parameters than the usual representations, such as decision trees or Finite-State Controllers (FSCs). In this paper, we show how PPRs can be used to improve the performance of a point-based algorithm for DEC-POMDPs.

Distributed Planning in Stochastic Games with Communication.
Andriy Burkov and Brahim Chaib-draa, In Proceedings of ICAPS'08 Multiagent Planning Workshop (MASPLAN'08), 2008, (pdf). (This is a preliminary workshop version of the paper. The final version is published in Proceedings of ICMLA'08) (bib).
Abstract: This paper treats the problem of distributed planning in general-sum stochastic games with communication when the model is known. Our main contribution is a novel, game theoretic approach to the problem of distributed equilibrium computation and selection. We show theoretically and via experiments that our approach to multiagent planning, when adopted by all agents, facilitates an efficient distributed equilibrium computation and leads to a unique equilibrium selection in general-sum stochastic games with communication.

Prediction-directed Compression of POMDPs.
Abdeslam Boularias, Masoumeh Izadi and Brahim Chaib-draa, In Proceedings of the International Conference on Machine Learning and Applications (ICMLA'08), 2008, (pdf) (bib).
Abstract: High dimensionality of belief space in Partially Observable Markov Decision Processes (POMDPs) is one of the major causes that severely restricts the applicability of this model. Previous studies have demonstrated that the dimensionality of a POMDP can eventually be reduced by transforming it into an equivalent Predictive State Representation (PSR). In this paper, we address the problem of finding an approximate and compact PSR model corresponding to a given POMDP model. We formulate this problem in an optimization framework. Our algorithm tries to minimize the potential error that missing some core tests may cause. We also present an empirical evaluation on benchmark problems, illustrating the performance of this approach.

A Predictive Model for Imitation Learning in Partially Observable Environments.
Abdeslam Boularias, In Proceedings of the International Conference on Machine Learning and Applications (ICMLA'08), 2008, (pdf) (bib).
Abstract: Learning by imitation has been shown to be a powerful paradigm for automated learning in autonomous robots. This paper presents a general framework of learning by imitation for stochastic and partially observable systems. The model is a Predictive Policy Representation (PPR) whose goal is to represent the teacher's policies without any reference to states. The model is fully described in terms of actions and observations only. We show how this model can efficiently learn the personal behavior and preferences of an assistive robot user.

Recherche incrémentale à base de points pour la résolution des DEC-POMDPs.
Jilles S. Dibangoye, Abdel-Illah Mouaddib and Brahim Chaib-draa, In Les Actes des 15es JFSMA, 2008, (pdf) (bib).
Abstract: We consider the problem of controlling a finite-horizon decentralized partially observable Markov decision process (DEC-POMDP). We introduce a new heuristic approach based on the following observations: (1) the elementary dynamic programming operation, which consists of exhaustively generating and evaluating all joint policies, is extremely prohibitive; (2) many of the joint policies generated this way are useless for optimal or near-optimal control. Following these observations, we propose the first technique for incrementally constructing joint policies based on belief states, PBIP, which avoids these intensive computations. The PBIP algorithm outperforms the best current approximate techniques on many examples from the literature.

Planification à base d'ordres topologiques pour la résolution des POMDPs.
Jilles S. Dibangoye, Brahim Chaib-draa and Abdel-Illah Mouaddib, In Les Actes des 3es JFPDA, 2008, (pdf) (bib).
Abstract: Although partially observable Markov decision processes (POMDPs) have received much attention in recent years, solving real-world problems remains a serious challenge to this day. In this context, techniques for accelerating the fundamental methods have been one of the main lines of research. Among them, ordered algorithms suggest solutions to the problem of ordering value function updates. These techniques considerably reduce the number of updates required, but in return they incur additional costs. In this paper, we present a new ordered approach: topological order-based planning (TOP). This approach exploits causal relations between states to address two issues: (1) detecting the structure of a POMDP as a means of overcoming both the curse of dimensionality and the curse of history; (2) reducing unnecessary updates and the determinism of an approximate value function, based on topological orders induced by the underlying structure of the POMDP. Experiments show that TOP is competitive with the best current algorithms. Keywords: POMDPs, sequential decision making, topological order.

A Novel Prioritization Technique for Solving Markov Decision Processes.
Jilles S. Dibangoye, Brahim Chaib-draa and Abdel-Illah Mouaddib, In Proceedings of 21st International FLAIRS Conference, 2008, (pdf) (bib).
Abstract: We address the problem of computing an optimal value function for Markov decision processes. Since finding this function quickly and accurately requires substantial computational effort, techniques that accelerate fundamental algorithms have been a main focus of research. Among them, prioritization solvers suggest solutions to the problem of ordering backup operations. Prioritization techniques for ordering the sequence of backup operations reduce the number of needed backups considerably, but involve significant overhead. This paper provides a new way to order backups, based on a mapping of the state space into a metric space. Empirical evaluation verifies that our method achieves the best balance between the number of backups executed and the effort required to prioritize backups, showing an order-of-magnitude improvement in runtime over a number of benchmarks.

Incremental Pruning Heuristic for Solving DEC-POMDPs.
Jilles S. Dibangoye, Abdel-Illah Mouaddib and Brahim Chaib-draa, In Proceedings of the Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains (MSDM'08), 2008 (bib).

Distributed Planning in Stochastic Games with Communication.
Andriy Burkov and Brahim Chaib-draa, In Proceedings of The 2008 International Conference on Machine Learning and Applications (ICMLA'08), 2008, (pdf). (This is a final version of preliminary results published in Proceedings of MASPLAN'08.) (bib).
Abstract: This paper treats the problem of distributed planning in general-sum stochastic games with communication when the model is known. Our main contribution is a novel, game theoretic approach to the problem of distributed equilibrium computation and selection. We show theoretically and via experiments that our approach, when adopted by all agents, facilitates an efficient distributed equilibrium computation and leads to a unique equilibrium selection in general-sum stochastic games with communication.


2007


Articles in Refereed Journals

Information Sharing as a Coordination Mechanism for Reducing the Bullwhip Effect in a Supply Chain.
Thierry Moyaux, Brahim Chaib-draa and Sophie D'Amours, In IEEE Transactions on Systems, Man, and Cybernetics-Part C (SMC-C), 396--409, 2007, (pdf) (bib).
Abstract: The bullwhip effect is an amplification of the variability of the orders placed by companies in a supply chain. This variability reduces the efficiency of supply chains, since it incurs costs due to higher inventory levels and supply chain agility reduction. Eliminating the bullwhip effect is surely simple; every company just has to order following the market demand, i.e., each company should use a lot-for-lot type of ordering policy. However, many reasons, such as inventory management, lot-sizing, and market, supply, or operation uncertainties, motivate companies not to use this strategy. Therefore, the bullwhip effect cannot be totally eliminated. However, it can be reduced by information sharing, which is the form of collaboration considered in this paper. More precisely, we study how to separate demand into original demand and adjustments. We describe two principles explaining how to use the shared information to reduce the amplification of order variability induced by lead times, which we propose as a cause of the effect. Simulations confirm the value of these two principles with regard to costs and customer service levels.

Multiagent Coordination Techniques For Complex Environments: The Case of a Fleet of Combat Ships.
Patrick Beaumont and Brahim Chaib-draa, In IEEE Transactions on Systems, Man, and Cybernetics-Part C (SMC-C), 373--384, 2007, (pdf) (bib).
Abstract: The use of agent and multiagent techniques to assist humans in their daily routines has been increasing for many years, notably in Command and Control (C2) systems. In this context, we propose using multiagent planning and coordination techniques for resource management in real-time C2 systems. The particular problem we studied is the design of decision support for Anti-Air Warfare (AAW) on combat ships. In this paper, we refer to the specific case of several combat ships defending against incoming threats, where coordination of their respective resources is a complex problem of capital importance. Efficient coordination mechanisms between the different combat ships are then important to avoid redundancy in engagements and inefficient defence caused by conflicting actions. To this end, we present four different coordination mechanisms based on task sharing. Three of these mechanisms are communication-based: central coordination, Contract Net coordination and ∼Brown coordination, while the last one is a zone defence coordination and is based on conventions. Finally, we present the results obtained while simulating these various mechanisms.

Conversational Semantics with Social Commitments.
Roberto A. Flores, Philippe Pasquier and Brahim Chaib-draa, In Journal of Autonomous Agents and Multi-Agent Systems, 165--186, 2007, (pdf) (bib).
Abstract: We propose an operational model that combines message meaning and conversational structure in one comprehensive approach. Our long-term research goal is to lay down principles uniting message meaning and conversational structure while providing an operational foundation that could be implemented in open computer systems. In this paper we explore our advances in one aspect of meaning that in theories of language use is known as “signal meaning”, and propose a layered model in which the meaning of messages can be defined according to their fitness to advance the state of joint activities. Messages in our model are defined in terms of social commitments, which have been shown to entice conversational structure.

Articles in Refereed Proceedings

Learning to Play a Satisfaction Equilibria.
Stéphane Ross and Brahim Chaib-draa, In Evolutionary Models of Collaboration (EMC'07) Workshop of Int. Joint Conf. on AI (IJCAI'07), 2007, (pdf) (bib).
Abstract: In real-life problems, agents are generally faced with situations where they have only partial or no knowledge about their environment and the other agents evolving in it. In this case, all an agent can do is reason about its own payoffs, and it cannot rely on the classical equilibria through deliberation. To address this difficulty, we introduce the satisfaction principle, from which an equilibrium can arise as the result of the agents' individual learning experiences. We define such an equilibrium and then present different algorithms that can be used to reach it. Finally, we present experimental results and theoretical proofs showing that, by using learning strategies based on this specific equilibrium, agents will generally coordinate themselves on a Pareto-optimal joint strategy that is not always a Nash equilibrium, even though each agent is individually rational in the sense that it tries to maximize its own satisfaction.

AEMS: An Anytime Online Search Algorithm for Approximate Policy Refinement in Large POMDPs.
Stéphane Ross and Brahim Chaib-draa, In Proc. of Int. Joint Conf. on AI (IJCAI'07), 2007, (pdf) (bib).
Abstract: Solving large Partially Observable Markov Decision Processes (POMDPs) is a complex task which is often intractable. A lot of effort has been made to develop approximate offline algorithms to solve ever larger POMDPs. However, even state-of-the-art approaches fail to solve large POMDPs in reasonable time. Recent developments in online POMDP search suggest that combining offline computations with online computations is often more efficient and can also considerably reduce the error made by approximate policies computed offline. In the same vein, we propose a new anytime online search algorithm which seeks to minimize, as efficiently as possible, the error made by an approximate value function computed offline. In addition, we show how previous online computations can be reused in following time steps in order to prevent redundant computations. Our preliminary results indicate that our approach is able to tackle large state and observation spaces efficiently and under real-time constraints.

Agent Neighbourhood for Learning Approximated Policies in DEC-MDP.
Julien Laumonier and Brahim Chaib-draa, In Evolutionary Models of Collaboration (EMC'07) Workshop of Int. Joint Conf. on AI (IJCAI'07), 2007, (pdf) (bib).
Abstract: Resolving multiagent team decision problems, where agents share a common goal, is challenging since the number of states and joint actions is exponential in the number of agents. Even if the resolution of such problems is theoretically possible via models such as DEC-MDP, it is often intractable. In this context, it is important to find a good approximated policy without high resolution complexity in the case of a team of agents. To this end, we propose in this article to introduce the notion of an agent’s neighbourhood as an approximation of the observable problem in terms of visible states, visible joint actions and visible rewards available to each agent. We present an algorithm based on Q-values where actions and states are functions of neighbouring agents, and present results which approximate the optimal solution in the context of MultiSysAdmin. We show that the value of the approximated policy stays close to the optimal value as the neighbourhood distance moves away from the optimal distance. Moreover, we show that partial rewards can improve the value of the approximated joint policy, mostly for problem sizes which are difficult to solve without partial rewards.

Tight Bounds for a Stochastic Resource Allocation Algorithm Using Marginal Revenue.
Pierrick Plamondon, Brahim Chaib-draa and Abder Rezak Benaskeur, In Proceedings of the AAAI 2007 Spring Symposium on Decision Theoretic and Game Theoretic Agents (GTDT'2007), 2007, (pdf) (bib).
Abstract: This paper contributes to effectively solving stochastic resource allocation problems known to be NP-Complete. To address this complex resource management problem, previous work on pruning the action space of real-time heuristic search is extended. The pruning is accomplished by using upper and lower bounds on the value function. This way, if an action in a state has its upper bound lower than the lower bound on the value of this state, this action may be pruned from the set of possible optimal actions for the state. This paper extends this previous work by proposing tight bounds for problems where tasks have to be accomplished using limited resources. The marginal revenue bound proposed in this paper compares favorably with another approach which proposes bounds for pruning the action space.

A Real-time Dynamic Programming Decomposition Approach to Resource Allocation.
Pierrick Plamondon, Brahim Chaib-draa and Abder Rezak Benaskeur, In Proceedings of the Information, Decision and Control (IDC 2007), 2007, (pdf) (bib).
Abstract: This paper contributes to effectively solving stochastic resource allocation problems known to be NP-Complete. To address this complex resource management problem, two approaches are merged and adapted in an effective way: the Q-decomposition model, which coordinates reward-separated agents through an arbitrator, and Labeled Real-Time Dynamic Programming (LRTDP). The Q-decomposition reduces the set of states to consider, while LRTDP concentrates the planning on significant states only. As demonstrated by the experiments, combining these two distinct approaches further reduces the planning time needed to obtain the optimal solution of a resource allocation problem.

Online Policy Improvement in Large POMDPs via an Error Minimization Search.
Stéphane Ross, Joelle Pineau and Brahim Chaib-draa, In Proceedings of the 2nd North East Student Colloquium on Artificial Intelligence (NESCAI), 2007, (pdf) (bib).
Abstract: Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical framework for planning under uncertainty. However, most real world systems are modelled by huge POMDPs that cannot be solved due to their high complexity. To address this difficulty, we propose combining existing offline approaches with an online search process, called AEMS, that can locally improve an approximate policy computed offline by reducing its error and providing better performance guarantees. We propose different heuristics to guide this search process, and provide theoretical guarantees on the convergence to ε-optimal solutions. Our experimental results show that our approach can provide better solution quality within a smaller overall time than state-of-the-art algorithms and allows for an interesting online/offline computation tradeoff.

A Q-decomposition and Bounded RTDP Approach to Resource Allocation.
Pierrick Plamondon, Brahim Chaib-draa and Abder Rezak Benaskeur, In Proceedings of the 2007 International Conference on Autonomous Agents and Multiagent Systems (AAMAS'07), 2007, (pdf) (bib).
Abstract: This paper contributes to effectively solving stochastic resource allocation problems known to be NP-Complete. To address this complex resource management problem, a Q-decomposition approach is proposed for the case where the resources are already shared among the agents, but the actions made by an agent may influence the reward obtained by at least one other agent. The Q-decomposition coordinates these reward-separated agents and thus reduces the set of states and actions to consider. On the other hand, when the resources are available to all agents, no Q-decomposition is possible and we use heuristic search. In particular, bounded Real-Time Dynamic Programming (bounded RTDP) is used. Bounded RTDP concentrates the planning on significant states only and prunes the action space. The pruning is accomplished by proposing tight upper and lower bounds on the value function.

Urban Traffic Control Based on Learning Agents.
Pierre-Luc Grégoire, Charles Desjardins, Julien Laumonier and Brahim Chaib-draa, In Proceedings of the 10th International IEEE Conference on Intelligent Transportation Systems (ITSC'07), 2007, (pdf) (bib).
Abstract: The optimization of traffic light control systems is at the heart of work in traffic management. Many of the solutions considered to design efficient traffic signal patterns rely on controllers that use pre-timed stages. Such systems are unable to identify dynamic changes in the local traffic flow and thus cannot adapt to new traffic conditions. An alternative, novel approach proposed by computer scientists in order to design adaptive traffic light controllers relies on the use of intelligent agents. The idea is to let autonomous entities, named agents, learn an optimal behavior by interacting directly in the system. By using machine learning algorithms based on the attribution of rewards according to the results of the actions selected by the agents, we can obtain a control policy that tries to optimize the urban traffic flow. In this paper, we will explain how we designed an intelligent agent that learns a traffic light control policy. We will also compare this policy with results from an optimal pre-timed controller.

Architecture and Design of a Multi-Layered Cooperative Cruise Control System.
Charles Desjardins, Pierre-Luc Grégoire, Julien Laumonier and Brahim Chaib-draa, In Proceedings of the SAE World Congress, 2007 (bib).

Bayes-Adaptive POMDPs.
Stéphane Ross, Brahim Chaib-draa and Joelle Pineau, In Proceedings of the 21st conference on Neural Information Processing Systems (NIPS'07), 2007, (pdf). A full report including the proofs follows. (bib).
Abstract: Bayesian Reinforcement Learning has generated substantial interest recently, as it provides an elegant solution to the exploration-exploitation trade-off in reinforcement learning. However, most investigations of Bayesian reinforcement learning to date focus on the standard Markov Decision Processes (MDPs). Our goal is to extend these ideas to the more general Partially Observable MDP (POMDP) framework, where the state is a hidden variable. To address this problem, we introduce a new mathematical model, the Bayes-Adaptive POMDP. This new model allows us to (1) improve knowledge of the POMDP domain through interaction with the environment, and (2) plan optimal sequences of actions which can trade off between improving the model, identifying the state, and gathering reward. We show how the model can be finitely approximated while preserving the value function. We describe approximations for belief tracking and planning in this model. Empirical results on two domains show that the model estimate and agent’s return improve over time, as the agent learns better model estimates.

Theoretical Analysis of Heuristic Search Methods for Online POMDPs.
Stéphane Ross, Joelle Pineau and Brahim Chaib-draa, In Proceedings of the 21st conference on Neural Information Processing Systems (NIPS'07), 2007, (pdf) (bib).
Abstract: Planning in partially observable environments remains a challenging problem, despite significant recent advances in offline approximation techniques. A few online methods have also been proposed recently, and proven to be remarkably scalable, but without the theoretical guarantees of their offline counterparts. Thus it seems natural to try to unify offline and online techniques, preserving the theoretical properties of the former, and exploiting the scalability of the latter. In this paper, we provide theoretical guarantees on an anytime algorithm for POMDPs which aims to reduce the error made by approximate offline value iteration algorithms through the use of an efficient online searching procedure. The algorithm uses search heuristics based on an error analysis of lookahead search, to guide the online search towards reachable beliefs with the most potential to reduce error. We provide a general theorem showing that these search heuristics are admissible, and lead to complete and ε-optimal algorithms. This is, to the best of our knowledge, the strongest theoretical result available for online POMDP solution methods. We also provide empirical evidence showing that our approach is practical, and can find (provably) near-optimal solutions in reasonable time.

A Markovian Model for Dynamic and Constrained Resource Allocation Problems.
Camille Besse and Brahim Chaib-draa, In Proceedings of the 22nd AAAI Conference on Artificial Intelligence, July 2007, Vancouver, BC, Canada, 1846--1847, 2007, (pdf) (bib).
Abstract: An autonomous agent, allocating stochastic resources to incoming tasks, faces increasingly complex situations when formulating its control policy. These situations are often constrained by limited resources of the agent, time limits, physical constraints or other agents. All these reasons explain why complexity and state space dimension increase exponentially in the size of the considered problem. Unfortunately, models that already exist consider either the sequential aspect of the environment, or its stochastic one, or its constrained one. To the best of our knowledge, there is no model that takes into account all three of these aspects. For example, dynamic constraint satisfaction problems (DCSP) have been introduced by Dechter & Dechter (1988) to address dynamic and constrained problems. However, in DCSPs, there is typically no transition model, and thus no concept of sequence of controls. On the other hand, Fargier, Lang, & Schiex (1996) proposed mixed CSPs (MCSPs), but this approach considers only the stochastic and the constrained aspects of the problem. In this paper, we introduce a new model based on DCSPs and Markov decision processes to address constrained stochastic resource allocation (SRA) problems by using the expressiveness and power of CSPs. We thus propose a framework which aims to model dynamic and stochastic environments for constrained resource allocation decisions, and present some complexity and experimental results.

An Efficient Model for Dynamic and Constrained Resource Allocation Problems.
Camille Besse and Brahim Chaib-draa, In Proceedings of the 2nd International Workshop on Constraint Satisfaction Techniques for Planning and Scheduling Problems (COPLAS'07), 9--16, 2007, (pdf) (bib).
Abstract: Dynamic constraint satisfaction is a useful tool for representing and solving sequential decision problems with complete knowledge in a dynamic world, and particularly constrained resource allocation problems. However, when resources are unreliable, this framework becomes limited due to the stochastic outcomes of the assignments chosen. On the contrary, Markov Decision Processes (MDPs) handle stochastic outcomes of unreliable actions, but their complexity explodes when using state-defined constraints. We thus propose an extension of the MDP framework so as to represent constrained and stochastic actions in sequential decision making. The basis of this extension consists in modeling the evolution of a dynamic constraint network by an MDP. We first study the complexity of the problem of finding an optimal policy for this model and then we propose an algorithm for solving it. Comparison with standard MDPs shows that this framework noticeably improves policy computation.

R-FRTDP: A Real-Time DP Algorithm with Tight Bounds for a Stochastic Resource Allocation Problem.
Camille Besse, Pierrick Plamondon and Brahim Chaib-draa, In Proceedings of the 20th Canadian Conference on Artificial Intelligence (AI'2007), 50--60, 2007, (pdf) (bib).
Abstract: Resource allocation is a widely studied class of problems in Operations Research and Artificial Intelligence. Of particular interest are constrained stochastic resource allocation problems, where the assignment of a constrained resource does not automatically imply the realization of the task. Problems of this kind are generally addressed with Markov Decision Processes (MDPs). In this paper, we present efficient lower and upper bounds in the context of a constrained stochastic resource allocation problem for a heuristic search algorithm called Focused Real-Time Dynamic Programming (FRTDP). Experiments show that this algorithm is relevant for this kind of problem and that the proposed tight bounds reduce the number of backups to perform compared with previously existing bounds.

Les Représentations Prédictives des états et des Politiques.
A. Boularias and B. Chaib-draa, In Actes des Quatrièmes Journées Francophones Modèles Formels de l'Interaction (MFI'07), 37--48, 2007, (pdf) (bib).
Abstract: In this article we propose a new approach for representing policies (strategies) in stochastic and partially observable environments. We focus in particular on multiagent systems in which each agent knows only its own policies and must choose the best among them according to its belief state over the policies of the other agents. Our model uses fewer parameters than the usual representation methods, such as decision trees or stochastic finite-state controllers, and thereby speeds up planning algorithms. We also show how this model can be used efficiently for cooperative multiagent planning without communication; the empirical results are compared with the DEC-POMDP (Decentralized Partially Observable Markov Decision Process) model.

Multiagent Learning in Adaptive Dynamic Systems.
Andriy Burkov and Brahim Chaib-draa, In Proceedings of the 2007 International Conference on Autonomous Agents and Multiagent Systems (AAMAS'07), 2007, (pdf) (bib).
Abstract: Classically, approaches to multiagent policy learning assumed that the agents, via interactions and/or by using preliminary knowledge about the reward functions of all players, would find an interdependent solution called "equilibrium". Recently, however, certain researchers have questioned the necessity and the validity of the concept of equilibrium as the most important multiagent solution concept. They argue that a "good" learning algorithm is one that is efficient with respect to a certain class of counterparts. Adaptive players are an important class of agents that learn their policies separately from the maintenance of the beliefs about their counterparts' future actions and make their decisions based on that policy and the current belief. In this paper, we propose an efficient learning algorithm in the presence of adaptive counterparts, called Adaptive Dynamics Learner (ADL), which is able to learn an efficient policy over the opponents' adaptive dynamics rather than over the simple actions and beliefs and, by so doing, to exploit these dynamics to obtain a higher utility than any equilibrium strategy can provide. We tested our algorithm on a substantial representative set of the best-known and most demonstrative matrix games and observed that the ADL agent is highly efficient against the Adaptive Play Q-learning (APQ) agent and the Infinitesimal Gradient Ascent (IGA) agent. In self-play, when possible, ADL is able to converge to a Pareto optimal strategy maximizing the welfare of all players.

Labeled Initialized Adaptive Play Q-learning for Stochastic Games.
Andriy Burkov and Brahim Chaib-draa, In Proceedings of the AAMAS'07 Workshop on Adaptive and Learning Agents (ALAg'07), 2007, (pdf) (bib).
Abstract: Recently, initial approximation of the Q-values of multiagent Q-learning by the optimal single-agent Q-values has shown good results in reducing the complexity of the learning process. In this paper, we continue in the same vein and give a brief description of the Initialized Adaptive Play Q-learning (IAPQ) algorithm while establishing an effective stopping criterion for this algorithm. To do that, we adapt a technique called "labeling" to the multiagent learning context. Our approach demonstrates good empirical behavior in multiagent coordination problems, such as the two-robot grid world stochastic game. We show that our Labeled IAPQ (i) is able to converge faster than IAPQ by permitting a certain predefined value of learning error and (ii) establishes an effective stopping criterion, which permits terminating the learning process at a near-optimal point with a flexible learning speed/quality tradeoff.

Competition and Coordination in Stochastic Games.
Andriy Burkov, Abdeslam Boularias and Brahim Chaib-draa, In Proceedings of the 2007 Twentieth Canadian Conference on Artificial Intelligence (CanAI'07), 2007, (pdf) (bib).
Abstract: Agent competition and coordination are two classical and most important tasks in multiagent systems. In recent years, a number of learning algorithms have been proposed to solve problems of this type. Among them, there is an important class of algorithms, called adaptive learning algorithms, that were shown to be able to converge in self-play to a solution in a wide variety of repeated matrix games. Although certain algorithms of this class, such as Infinitesimal Gradient Ascent (IGA), Policy Hill-Climbing (PHC) and Adaptive Play Q-learning (APQ), have been widely studied in the recent literature, the question of how these algorithms perform against each other in general-form stochastic games remains little studied. In this work we try to answer this question. To do that, we analyse these algorithms in detail and give a comparative analysis of their behavior on a set of competition and coordination stochastic games. Also, we introduce a new multiagent learning algorithm, called ModIGA. This is an extension of the IGA algorithm, which is able to estimate the strategy of its opponents in the cases when they do not explicitly play mixed strategies (e.g., APQ) and which can be applied to games with more than two actions.

Effective Learning in Adaptive Dynamic Systems.
Andriy Burkov and Brahim Chaib-draa, In Proceedings of the AAAI 2007 Spring Symposium on Decision Theoretic and Game Theoretic Agents (GTDT'07), 2007, (pdf) (bib).
Abstract: Classically, approaches to policy learning in multiagent systems assumed that the agents, via interactions and/or by using preliminary knowledge about the reward functions of all players, would find an interdependent solution called "equilibrium". Recently, however, certain researchers have questioned the necessity and the validity of the concept of equilibrium as the most important multiagent solution concept. They argue that a "good" learning algorithm is one that is efficient with respect to a certain class of counterparts. Adaptive players are an important class of agents that learn their policies separately from the maintenance of the beliefs about their counterparts' future actions and make their decisions based on that policy and the current belief. In this paper we propose an efficient learning algorithm in the presence of adaptive counterparts, called Adaptive Dynamics Learner (ADL), which is able to learn an efficient policy over the opponents' adaptive dynamics rather than over the simple actions and beliefs and, by so doing, to exploit these dynamics to obtain a higher utility than any equilibrium strategy can provide. We tested our algorithm on a large set of the best-known and most demonstrative matrix games and observed that the ADL agent is highly efficient against the Adaptive Play Q-learning (APQ) agent and the Infinitesimal Gradient Ascent (IGA) agent. In self-play, when possible, ADL is able to converge to a Pareto optimal strategy that maximizes the welfare of all players instead of an equilibrium strategy.

Adaptive Play Q-Learning with Initial Heuristic Approximation.
Andriy Burkov and Brahim Chaib-draa, In Proceedings of the 2007 IEEE International Conference on Robotics and Automation (ICRA'07), 2007, (pdf) (bib).
+AbstractThe effective coordination of multiple autonomous robots is one of the most important problems in modern robotics. In turn, it is well known that learning to coordinate multiple autonomous agents in a multiagent system is one of the most complex challenges of state-of-the-art intelligent system design. This is principally because of the exponential growth of the environment's dimensionality with the number of learning agents. This challenge is known as the ``curse of dimensionality'': each state of the system is a joint state of all agents and each action is a joint action composed of the actions of each agent. In this paper, we address this problem for the restricted class of environments known as goal-directed stochastic games with action-penalty representation. We use a single-agent problem solution as a heuristic approximation of the agents' initial preferences and, by so doing, greatly restrict the space of multiagent learning. We show theoretically the correctness of such an initialization, and experimental results in a well-known two-robot grid world problem show a significant reduction in the complexity of the learning process.

Periodic Real-Time Resource Allocation for Teams of Progressive Processing Agents.
Jilles S. Dibangoye, Abdel-Illah Mouaddib and Brahim Chaib-draa, In Proceedings of The 6th International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS'07), 2007, (pdf) (bib).


2006


Articles in Refereed Journals

DIAGAL : An agent communication language based on dialogues games and sustained by social commitments.
Marc-André Labrie, Mathieu Bergeron, Brahim Chaib-draa and Philippe Pasquier, In Journal of Autonomous Agent and Multi-Agent Systems, 61--95, 2006, (pdf) (bib).
+AbstractIn recent years, social commitment based approaches have been proposed to solve problems issuing from previous mentalistic based semantics for agent communication languages. This paper follows the same line of thought since it presents the latest version of our dialogue game based agent communication language – DIAlogue-Game based Agent Language (DIAGAL) – which allows agents to manipulate the public layer of social commitments through dialogue, by creating, canceling and updating their social commitments. To make apparent such commitments, we consider here Agent Communication Language (ACL) from the dialectic point of view, where agents “play a game” based on commitments. Such games based on commitments are incorporated in the DIAGAL language, which has been developed having in mind the following questions: (a) What kind of structure does the game have? How are rules specified within the game? (b) What kind of games compositions are allowed? (c) How do participants in conversations reach agreement on the current game? How are games opened or closed? Using such games we show how we can study the commitments dynamic to model agent dialogue and we present metrics that can be used to evaluate the quality of a dialogue between agents. Next, we use an example (summer festival organization) to show how DIAGAL can be used in analyzing and modeling automated conversations in offices. Finally, we present the results and analysis of the summer festival simulations that we realized through our dialogue game simulator (DGS).

Prise de décision en temps-réel pour des POMDPs de grande taille.
Sébastien Paquet, Ludovic Tobin and Brahim Chaib-draa, In Revue d'intelligence artificielle, 203--233, 2006, (pdf) (bib).
+AbstractThis paper presents a POMDP approximation method, called RTBSS (Real-Time Belief Space Search), which is based on a look-ahead search in order to plan in a real-time dynamic environment. The basis of our approach is to avoid computing full policies in POMDP problems. Our approach is especially motivated by real-time environments where the state space is too large for traditional offline algorithms. We therefore use an online approach that finds, at each step, the action that maximizes the agent's expected utility. To this end, we present the formalism behind our approach. Then, we present how the approach was applied to three different environments: Tag, RockSample and the RoboCupRescue simulation. Finally, the approach was successfully implemented for the RoboCupRescue 2004 international competition in Lisbon, Portugal, where we finished in second position.

Performance of software agents in non-transferable payoff group buying.
Frederick Asselin and Brahim Chaib-draa, In Journal of Experimental and Theoretical Artificial Intelligence, 1--32, 2006, (pdf) (bib).
+AbstractSoftware agents can be useful in forming buyers’ groups since humans have considerable difficulties in finding Pareto-optimal deals (no buyer can be better without another being worse) in negotiation situations. What are the computational and economical performances of software agents for a group buying problem? We have developed a negotiation protocol for software agents which we have evaluated to see if the problem is difficult on average and why. This protocol probably finds a Pareto-optimal solution and, furthermore, minimizes the worst distance to ideal among all software agents given strict preference ordering. This evaluation demonstrated that the performance of software agents in this group buying problem is limited by memory requirements (and not execution time complexity). We have also investigated whether software agents following the developed protocol have a different buying behaviour from that which the customer they represented would have had in the same situation. Results show that software agents have a greater difference of behaviour (and better behaviour since they can always simulate the obvious customer behaviour of buying alone their preferred product) when they have similar preferences over the space of available products. We also discuss the type of behaviour changes and their frequencies based on the situation.

Apprentissage de la coordination multiagent : une méthode basée sur le Q-learning par jeu adaptatif.
Olivier Gies and Brahim Chaib-draa, In Revue d'intelligence artificielle, 385--412, 2006, (pdf) (bib).
+AbstractCurrent algorithms for multiagent learning are mostly limited, since they cannot manage the multiplicity of Nash equilibria and therefore cannot converge to the Pareto-optimal one. To alleviate this, we propose a learning mechanism extending Q-learning to non-cooperative stochastic games. This learning mechanism converges to Pareto-optimal equilibria in self-play. We present experimental results showing the convergence of this learning mechanism. We then extend our approach to the case of non-stationary agents, which is another important aspect of multiagent systems. Finally, we tackle the question of non-stationarity in multiagent environments in general and present, in this context, some research avenues that could improve our preliminary results on adaptation.

Book Chapters

Supply Chain Management and Multiagent Systems: An Overview.
Thierry Moyaux, Brahim Chaib-draa and Sophie D'Amours, In MultiAgent-Based Supply Chain Management, Springer, 2006, (pdf) (bib).
+AbstractThis chapter introduces the topic of this book by presenting the fields of supply chain management, multiagent systems, and the merger of these two fields into multiagent-based supply chain management. More precisely, the problems encountered in supply chains and the techniques to address these problems are first presented. Multiagent systems are next broadly presented, before focusing on how agents can contribute to solving problems in supply chains.

Design, Implementation and Test of Collaborative strategies in the Supply Chain.
Thierry Moyaux, Brahim Chaib-draa and Sophie D'Amours, In MultiAgent-Based Supply Chain Management, Springer, approx 450 p., 2006, (pdf) (bib).
+AbstractIn general, game theory is used to analyze interactions formally described by an analytical model. In this paper, we describe a methodology that replaces the analytical model with a simulation model in order to study more realistic situations. We use this methodology to study how the degree of selfishness of agents affects their behaviour. We illustrate our methodology with the case study of a wood supply chain, in which every company is seen as an agent that may use an ordering strategy designed to reduce a phenomenon called the bullwhip effect. To this end, we assume that every agent's utility can be split into two parts: a first part representing the direct utility of agents (in practice, their inventory holding cost) and a second part representing agent social consciousness, i.e., their impact on the rest of the multi-agent system (in practice, their backorder cost). We find that company-agents often apply their collaborative strategy whatever their common level of social consciousness. Our interpretation of this specific case study is that every company is so strongly related to the others that all should collaborate in our supply chain model. Note that a previous paper outlined this methodology and detailed its application to supply chains; our focus is now on the presentation and the extension of the methodology, rather than on its application to supply chains.

Edited Books

Multiagent-based Supply Chain Management.
Brahim Chaib-draa and Jörg Müller, Springer, 2006 (bib).

Articles in Refereed Proceedings

Cooperative Adaptive Cruise Control: a Reinforcement Learning Approach.
Julien Laumonier, Charles Desjardins and Brahim Chaib-draa, In 4th Workshop on Agents in Traffic And Transportation, AAMAS'06, 2006, (pdf) (bib).
+AbstractAs a part of Intelligent Transport Systems (ITS), Cooperative Adaptive Cruise Control (CACC) systems have been introduced for finding solutions to the modern problems of automotive transportation such as traffic efficiency, passenger comfort and security. To achieve cooperation, actors on the road must use internal sensors and communication. Designing such a controller is not an easy task when the problem is considered in its entirety, since the interactions taking place in the environment (from vehicle physics and dynamics to multi-vehicle interaction) are extremely complex and hard to model formally. That is why many ITS approaches consider many levels of functionality. In this article, we present our work toward the design of a multiple-level architecture using reinforcement learning techniques. We explain our work on the design of a longitudinal ACC controller, which is the first step toward a fully functional CACC low-level controller. We describe the design of our high-level controller used for vehicle coordination. Preliminary results show that, in some situations, the vehicle-following controller is stable. We also show that the coordination controller achieves an efficient lane allocation for vehicles. Finally, we present some future improvements that will integrate both approaches in a general architecture centered on the design of a CACC system using reinforcement learning.

Partial Local FriendQ Multiagent Learning: Application to Team Automobile Coordination Problem.
Julien Laumonier and Brahim Chaib-draa, In Canadian AI, 2006, (pdf) (bib).
+AbstractReal-world multiagent coordination problems are important issues for reinforcement learning techniques. In general, these problems are partially observable, and this characteristic makes computing the solution intractable. Most of the existing approaches calculate exact or approximate solutions using the world model for only one agent. To handle a special case of partial observability, this article presents an approach to approximate the policy measuring a degree of observability for a purely cooperative vehicle coordination problem. We compare empirically the performance of the learned policy for totally observable problems and the performance of policies for different degrees of observability. If each degree of observability is associated with communication costs, multiagent system designers are able to choose a compromise between the performance of the policy and the cost of obtaining the associated degree of observability of the problem. Finally, we show how the available space surrounding an agent influences the degree of observability required for a near-optimal solution.

Hybrid POMDP Algorithms.
Sébastien Paquet, Brahim Chaib-draa and Stéphane Ross, In Proceedings of The Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains (MSDM), 2006, (pdf) (bib).
+AbstractWhen an agent evolves in a partially observable environment, it has to deal with uncertainties when choosing its actions. An efficient way to model such environments is to use partially observable Markov decision processes (POMDPs). Many algorithms have been developed for POMDPs. Some use an offline approach, learning a complete policy before the execution. Others use an online approach, constructing the policy online for the current belief state. In this article, we present three hybrid algorithms that have been developed to combine the strengths of these two extreme approaches (offline and online). We present results showing that hybrid algorithms can often obtain better results than the online or the offline algorithms alone.

Integrating Social Commitment-Based Communication in Cognitive Agent Modelling.
Philippe Pasquier, Roberto Flores and Brahim Chaib-draa, In Proceedings of The International Workshop on Agent Communication (ACL'06), AAMAS'06, 2006, (pdf) (bib).
+AbstractIn this paper, we extend the classical BDI architecture for the treatment of social commitments based communication by: (1) linking social commitments and individual intentions, (2) providing a model of the cognitive aspect of communication pragmatics in order to automatize social commitment based communication. In particular, we introduce a general decision-making process leading to attitude change in the appropriate cases.

Modelling the Links Between Social Commitments and Individual Intentions.
Philippe Pasquier, Roberto Flores and Brahim Chaib-draa, In Proceedings of The 5th International Joint Conference on Autonomous Agents and MultiAgent Systems (AAMAS'06), 2006, (pdf) (bib).
+AbstractSocial commitments have been increasingly used to model inter-agent dependencies and normative aspects of multiagent systems such as the semantics of agent communication. However, current cognitive agent architectures rest on a formalization of private mental states. In this paper, we propose a model of the links between private mental states resulting in individual intentions and social commitments.

An Ontology of Social Control Tools.
Philippe Pasquier, Roberto Flores and Brahim Chaib-draa, In Proceedings of The 5th International Joint Conference on Autonomous Agents and MultiAgent Systems (AAMAS'06), 2006, (pdf) (bib).
+AbstractIn multi-agent systems, social commitments are increasingly used to capture roles, social norms, the semantics of agent communication as well as other inter-agent dependencies. Those systems rest on the assumption that agents respect their commitments. In this paper, we present an ontology of sanctions and punishment philosophies, which are required ingredients of any social control mechanism able to foster agents’ compliance with the commitments they create.

An Efficient Resource Allocation Approach in Real-time Stochastic Environment.
Pierrick Plamondon, Brahim Chaib-draa and Abder Rezak Benaskeur, In Canadian AI, 2006, (pdf) (bib).
+AbstractWe are interested in contributing to effectively solving a particular type of real-time stochastic resource allocation problem. A first distinction is that certain tasks may create other tasks. In addition, positive and negative interactions among the resources are considered in achieving the tasks, in order to obtain and maintain efficient coordination. A standard Multiagent Markov Decision Process (MMDP) approach is too costly to solve this type of problem in real time. To address this complex resource management problem, we propose merging an approach that handles the complexity associated with a high number of different resource types (i.e. Multiagent Task Associated Markov Decision Processes (MTAMDP)) with an approach that handles the complexity associated with the creation of tasks by other tasks (i.e. Acyclic Decomposition). The combination of these two approaches produces a near-optimal solution in much less time than a standard MMDP approach.

A Multiagent Task Associated MDP (MTAMDP) Approach to Resource Allocation.
Pierrick Plamondon, Brahim Chaib-draa and Abderrezak Benaskeur, In AAAI 2006 Spring Symposium on Distributed Plan and Schedule Management, 2006, (pdf). Stanford, California (bib).
+AbstractWe are interested in contributing to effectively solving a specific type of real-time stochastic resource allocation problem, known to be NP-Hard, whose main distinction is the high number of possible interacting actions to execute in a group of tasks. To address this complex resource management problem, we propose an adaptation of the Multiagent Markov Decision Process (MMDP) model which centralizes the computation of interacting resources. This adaptation is called Multiagent Task Associated Markov Decision Process (MTAMDP) and produces a near-optimal solution policy in much less time than a standard MMDP approach. In an MTAMDP, a planning agent computes a policy for each resource, and these planning agents are coordinated by a central agent. MTAMDPs make it practical to solve our NP-Hard problem.

Satisfaction Equilibrium: Achieving Cooperation in Incomplete Information Games.
Stéphane Ross and Brahim Chaib-draa, In Canadian AI, 2006, (pdf) (bib).
+AbstractSo far, most equilibrium concepts in game theory require that the rewards and actions of the other agents are known and/or observed by all agents. However, in real-life problems, agents are generally faced with situations where they only have partial or no knowledge about their environment and the other agents evolving in it. In this context, all an agent can do is reason about its own payoffs; consequently, it cannot rely on classical equilibria through deliberation, which requires full knowledge and observability of the other agents. To overcome this difficulty, we introduce the satisfaction principle, from which an equilibrium can arise as the result of the agents' individual learning experiences. We define such an equilibrium and then present different algorithms that can be used to reach it. Finally, we present experimental results showing that, using learning strategies based on this specific equilibrium, agents will generally coordinate themselves on a Pareto-optimal joint strategy that is not always a Nash equilibrium, even though each agent is individually rational in the sense that it tries to maximize its own satisfaction.

Study of social consciousness in stochastic agent-based simulations: Application to supply chains.
Philippe Pasquier, Roberto Flores and Brahim Chaib-draa, In Proceedings of The 5th International Joint Conference on Autonomous Agents and MultiAgent Systems (AAMAS'06), 2006, (pdf) (bib).
+AbstractEmpirical game theory allows studying the strategic interactions of agents in simulations. Specifically, traditional game theory describes such interactions by an analytical model, while empirical game theory employs simulations. In this paper, we use empirical game theory to study how the degree of selfishness of agents affects their behaviour. To this end, we assume that every agent's utility can be split into two parts, a first part representing the direct utility of agents and a second part representing agent social consciousness, i.e., their impact on the rest of the multiagent system. An application to supply chains illustrates this approach. In this application, the collaborative strategy is often used by every company-agent whatever their common level of social consciousness, which may indicate that every agent is strongly related to the others.

Learning the Required Number of Agents for Complex Tasks.
Sébastien Paquet and Brahim Chaib-draa, In Proceedings of The Fifth International Joint Conference on Autonomous Agents & Multi Agent Systems (AAMAS-06), 2006, (pdf) (bib).
+AbstractCoordinating agents in a complex environment is a hard problem, but it can become even harder when certain characteristics of the tasks, like the required number of agents, are unknown. In those settings, agents not only have to coordinate themselves on the different tasks, but they also have to learn how many agents are required for each task. To achieve that, we have developed a selective perception reinforcement learning algorithm to enable agents to learn the required number of agents. Even though there were continuous variables in the task description, the agents were able to learn their expected reward according to the task description and the number of agents. The results, obtained in RoboCupRescue, show an improvement in the agents' overall performance.

A Technique for Large Automated Mechanism Design Problems.
Frederick Asselin, Brigitte Jaumard and Antoine Nongaillard, In Proceedings of the IEEE/WIC/ACM Conference on Intelligent Agent Technology (IAT'06), 2006, (pdf) (bib).
+AbstractAutomated mechanism design (AMD) seeks to find, using algorithms, the optimal rules of interaction (a mechanism) between selfish and rational agents in order to get the best outcome. Here, optimal is defined by the objective function of the designer of the mechanism, where the function usually has some desirable properties (e.g. Pareto optimality). A difficulty with AMD lies in the size of the optimization problem that one needs to solve in order to select the best mechanism: there is a huge number of variables (and constraints, but to a lesser extent) even for AMD instances of relatively small size. We study how to adapt column generation techniques in order to solve the linear programming (LP) formulation of the AMD problem, and we compare their efficiency with that of the classical simplex algorithm for linear programs on a bartering of goods example. We show that the resulting column generation algorithm very quickly becomes faster than the simplex algorithm, first for a fixed number of types (i.e., preference relations) on the goods as the number of goods increases, and then for a fixed number of goods as the number of types increases. Moreover, we show that, as the number of goods increases, the percentage of variables that need to be explicitly considered by the column generation techniques comes down very fast, while the simplex algorithm must always explicitly consider all variables.

Resolution-based Policy Search for Imperfect-information Differential Games.
Minh Nguyen-Duc and Brahim Chaïb-draa, In Proceedings of the IEEE/WIC/ACM Conference on Intelligent Agent Technology (IAT'06), 2006, (pdf) (bib).
+AbstractDifferential games (DGs), considered as a typical model of games with continuous states and non-linear dynamics, play an important role in control and optimization. Finding optimal/approximate solutions for these games in the imperfect information setting is currently a challenge for mathematicians and computer scientists. This article presents a multi-agent learning approach to this problem. We propose a method called resolution-based policy search, which uses a limited non-uniform discretization of a perfect information version of the game to parameterize the policies to learn. We then study the application of this method to an imperfect information zero-sum pursuit-evasion game (PEG). Experimental results demonstrate strong performance of our method and show that it gives better solutions than those given by traditional analytical methods.

A Q-decomposition LRTDP Approach to Resource Allocation.
Pierrick Plamondon, Brahim Chaib-draa and Abder Rezak Benaskeur, In Proceedings of the IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT 2006), 2006, (pdf) (bib).
+AbstractThis paper contributes to effectively solving stochastic resource allocation problems known to be NP-Complete. To address this complex resource management problem, two approaches are merged: the Q-decomposition model, which coordinates reward-separated agents through an arbitrator, and the Labeled Real-Time Dynamic Programming (LRTDP) approach, both adapted in an effective way. Q-decomposition makes it possible to reduce the set of states to consider, while LRTDP concentrates the planning on significant states only. As demonstrated by the experiments, combining these two distinct approaches further reduces the planning time needed to obtain the optimal solution of a resource allocation problem.


2005


Articles in Refereed Journals

Agent Communication Pragmatics: The Cognitive Coherence Approach.
Philippe Pasquier and Brahim Chaib-draa, In Journal of Cognitive Systems Research, 364--395, 2005, (pdf) (bib).
+AbstractDifferent approaches have investigated the syntax and semantics of agent communication languages. However, these approaches have not indicated how agents should dynamically use communications. Instead of filling this pragmatics gap, most approaches have mainly focused on the structure of dialogues, even though developers are more interested in agents' capabilities of having useful automated conversations with respect to their goals than in their abilities to structure dialogues. This led us to work on a theory of the use of conversations between agents. In this paper, we propose a pragmatics theory which extends and adapts the cognitive dissonance theory (a major theory of social psychology) to multi-agent systems by unifying it with the theory of coherence in thought and action that issues from computational philosophy of mind. Precisely, we show how this theory allows us to provide generic conceptual tools for the automation of both agent communicational behavior and attitude change processes. This new motivational model is formulated in terms of constraints and elements of cognition and allows us to define cognitive incoherences and dialogue utility measures. We show how these measures could be used to solve common problems and answer some critical questions concerning the use of agent communication frameworks. Finally, our exploration in applying the cognitive coherence pragmatics theory as a new communication layer over classical BDI agents is presented. It relies on our dialogue games based agent communication language (DIAGAL) and our dialogue games simulator toolbox (DGS). The resulting framework provides the necessary theoretical and practical elements for implementing our theory. In doing so, it brings in a general scheme for automatizing agents' communicational behavior, as exemplified in this article.

A Collaborative Driving System based on Multiagent Modelling and Simulations.
Simon Hallé and Brahim Chaib-draa, In Journal of Transportation Research Part C (TRC-C): Emergent Technologies, 320--345, 2005, (pdf) (bib).
+AbstractCollaborative driving is a growing domain of intelligent transportation systems (ITS) that makes use of communications to autonomously guide cooperative vehicles on an automated highway system (AHS). In this paper, we address this issue by using a platoon of cars considered as more or less autonomous software agents. To achieve this, we propose a hierarchical driving agent architecture based on three layers (guidance layer, management layer and traffic control layer). This architecture has been used to develop centralized platoons, where the driving agent of the head vehicle coordinates other driving agents by applying strict rules, and decentralized platoons, where the platoon is considered as a group of driving agents with a similar degree of autonomy, trying to maintain a stable platoon. The latter decentralized model mainly considers an agent teamwork model based on a multiagent architecture, known as STEAM. The centralized and decentralized coordination models are finally compared using results from simulation scenarios that highlight safety, time efficiency and communication efficiency aspects for each model.

Book Chapters

Collaborative Driving System Using Teamwork for Platoon Formations.
Simon Hallé and Brahim Chaib-draa, In Applications of Agent Technology in Traffic and Transportation, Birkhäuser, 2005, (pdf) (bib).
+AbstractCollaborative driving is a growing domain of Intelligent Transportation Systems (ITS) that makes use of communications to autonomously guide cooperative vehicles on an Automated Highway System (AHS). In this paper, we address this issue by using a platoon of cars considered as more or less autonomous software agents. To achieve this, we propose a hierarchical architecture based on three layers (Guidance layer, Management layer and Traffic Control layer), which can be used to develop coordination models for centralized platoons (where a head vehicle-agent coordinates other vehicle-agents by applying its coordination rule) and decentralized platoons (where the platoon is considered as a team of vehicle-agents trying to maintain the platoon). The latter decentralized model mainly considers a software agent teamwork model using architectures like STEAM. These different coordination models will be compared using results on preliminary simulation scenarios, to provide arguments for and against each approach.

Articles in Refereed Proceedings

ACL: Specification, Design and Analysis All Based on Commitments.
Mathieu Bergeron and Brahim Chaib-draa, In Proceedings of the Workshop on Agent Communication (AC2005), Fourth International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS 2005), 2005, (pdf) (bib).
+AbstractIn recent years, social commitment based approaches have been proposed to solve problems issuing from previous mentalistic based semantics for agent communication languages. This paper follows the same line of thought since it presents a dialogue game based agent communication language (called DIAGAL) which allows agents to manipulate the public layer of social commitments through dialogue, by creating, cancelling, ..., updating their social commitments. Then we show how we can study the commitments dynamic to model agent dialogue and we present some metrics that can be used to evaluate the quality of dialogue. Next, we use an example (summer festival organization) to show how DIAGAL can be used in analyzing and modelling automated conversations in offices. Finally, we present the results and analysis of the summer festival simulations that we realized through our dialogue game simulator (DGS).

Decomposition Techniques for a Loosely-Coupled Resource Allocation Problem.
P. Plamondon, B. Chaib-draa and A. Benaskeur, In Proceedings of the IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT 2005), 2005, (pdf). France, 19-22 September, 2005 (bib).
+AbstractWe are interested in contributing to stochastic problems whose main distinction is that some tasks may create other tasks. In particular, we present a first approach which represents the problem by an acyclic graph and solves each node in a certain order so as to produce an optimal solution. Then, we detail a second algorithm, which solves each task separately, using the first approach, and where an on-line heuristic computes the global actions to execute when the state of a task changes.

A Layered Model for Message Semantics using Social Commitments.
Roberto A. Flores, Philippe Pasquier and Brahim Chaib-draa, In Proceedings of the 4th International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS'2005), 1323--1324, 2005, poster (bib).

Multiagent Q-Learning: Preliminary Study on Dominance between the Nash and Stackelberg Equilibriums.
Julien Laumonier and Brahim Chaib-draa, In Proceedings of AAAI-2005 Workshop on Multiagent Learning, 2005, (pdf) (bib).
+AbstractSome game theory approaches to solve multiagent reinforcement learning in self play, i.e. when agents use the same algorithm for choosing action, employ equilibriums, such as the Nash equilibrium, to compute the policies of the agents. These approaches have been applied only on simple examples. In this paper, we present an extended version of Nash Q-Learning using the Stackelberg equilibrium to address a wider range of games than with the Nash Q-Learning. We show that mixing the Nash and Stackelberg equilibriums can lead to better rewards not only in static games but also in stochastic games. Moreover, we apply the algorithm to a real world example, the automated vehicle coordination problem.

Modeling Flexible Social Commitments and their Enforcement.
Philippe Pasquier, Roberto Flores and Brahim Chaib-draa, In Proceedings of the Fifth International Workshop Engineering Societies in the Agents World (ESAW), M.-P. Gleizes and A. Omicini and F. Zambonelli, 153--165, 2005, (pdf) (bib).
+AbstractFor over a decade, agent research has shown that social commitments support the definition of open multiagent systems by capturing the responsibilities that agents contract toward one another through their communications. These systems, however, rely on the assumption that agents respect the social commitments they adopt. To overcome this limitation, in this paper we investigate the role of sanctions as elements whose enforcement fosters agents’ compliance with adopted commitments. In particular, we present a model of flexible social commitments to which sanctions are attached, and where the enforcement of sanctions acts as a social control mechanism for the satisfaction of commitments.

DIAGAL: an ACL ready for Open System.
Philippe Pasquier, Mathieu Bergeron and Brahim Chaib-draa, In Proceedings of the Fifth International Workshop Engineering Societies in the Agents World (ESAW), M.-P. Gleizes and A. Omicini and F. Zambonelli, 139--152, 2005, (pdf) (bib).
+AbstractIn this paper, we present the latest version of our dialogue games based agent communication language (DIAGAL) which allows the agents to manipulate the public layer of social commitments through dialogue. We show that DIAGAL is complete according to the sequential creation, cancellation, update and discharge of social commitments. We also extend and refine notions of success and satisfaction previously associated with speech-acts to this new dialogical setting. Finally, we explain why DIAGAL is a good candidate for open and heterogeneous MAS development.

An Online POMDP Algorithm for Complex Multiagent Environments.
Sébastien Paquet, Ludovic Tobin and Brahim Chaib-draa, In Proceedings of The 4th International Joint Conference on Autonomous Agents & Multi Agent Systems (AAMAS'2005), 2005, (pdf) (bib).
+AbstractIn this paper, we present an online method for POMDPs, called RTBSS (Real-Time Belief Space Search), which is based on a look-ahead search to find the best action to execute at each cycle in an environment. We thus avoid the overwhelming complexity of computing a policy for each possible situation. By doing so, we show that this method is particularly efficient for large real-time environments where offline approaches are not applicable because of their complexity. We first describe the formalism of our online method, followed by some results on standard POMDPs. Then, we present an adaptation of our method for a complex multiagent environment and results showing its efficiency in such environments.

Apprentissage de la coordination multiagent: Q-learning par jeu adaptatif.
Olivier Gies and Brahim Chaib-draa, In Actes des Troisièmes Journées Francophones Modèles Formels de l'Interaction (MFI'2005), 2005, (pdf) (bib).
+AbstractIn multiagent learning, much work has so far sought to establish algorithms that converge to a Nash equilibrium in stochastic games. Such algorithms are limited, however, in that they cannot handle multiple Nash equilibria or converge to the Pareto-optimal equilibrium when one exists. They generally rely on a convention to select the most appropriate Nash equilibrium when several exist. To overcome this, we propose a learning algorithm that extends Q-learning to non-cooperative stochastic games and that converges in self-play (games in which all agents use the same learning algorithm) to the Pareto-optimal Nash equilibrium. We present experimental results showing the convergence of such an algorithm in homogeneous games to a Nash equilibrium, as a mutual best-response equilibrium (hence to a Pareto-optimal Nash equilibrium), without the need for an explicit coordination convention.

Coordination d'agents à l'aide d'un algorithme en-ligne pour les POMDPs.
Sébastien Paquet, Ludovic Tobin and Brahim Chaib-draa, In Actes des Troisièmes Journées Francophones Modèles Formels de l'Interaction (MFI'2005), 2005, (pdf) (bib).
+AbstractThis paper presents an online method for POMDPs based on a look-ahead search to find the best action to execute at each cycle in an environment. The basic idea of our approach, called RTBSS (Real-Time Belief Space Search), is to avoid computing a complete policy. Our approach is especially motivated by real-time environments where the state space is too large to consider traditional algorithms. We first describe the formalism of our online method, followed by some results on standard POMDPs. Then, we present an adaptation of our method for multiagent environments on an example presenting a way to manage the agent’s interactions with the dynamic parts of the world and a coordination method based on the reward function.

Multiagent Systems Viewed as Distributed Scheduling Systems: Methodology and Experiments.
Sébastien Paquet, Nicolas Bernier and Brahim Chaib-draa, In Proceedings of the 18th Canadian Conference on Artificial Intelligence (AI'2005), 2005, (pdf) (bib).
+AbstractIn this article, we present a design technique that facilitates the work of extracting and defining the task scheduling problem for a multiagent system. We also compare a centralized scheduling approach to a decentralized scheduling approach to see the difference in the efficiency of the schedules and the amount of information transmitted between the agents. Our experimental results show that the decentralized approach needs fewer messages, while being as efficient as the centralized approach.

Real-Time Decision Making for Large POMDPs.
Sébastien Paquet, Ludovic Tobin and Brahim Chaib-draa, In Proceedings of the 18th Canadian Conference on Artificial Intelligence (AI'2005), 2005, (pdf) (bib).
+AbstractPartially Observable Markov Decision Processes (POMDPs) provide a very general model for sequential decision problems in partially observable environments. The main problem with POMDPs is that their complexity makes them applicable only on small environments. In this paper, we introduce a novel idea for POMDPs that, to our knowledge, has not received a lot of attention. The idea is to use an online approach based on a look-ahead search to find the best action to execute at each cycle in the environment. By doing so, we avoid the overwhelming complexity of computing a policy for every possible situation the agent could encounter. Since there is no computation offline, the algorithm is immediately applicable to previously unseen environments, if the environments’ dynamics are known. Also, since we need a fast online algorithm, we opted for a factored POMDP representation and a branch and bound strategy based on a limited depth first search instead of classical dynamic programming. The tradeoff obtained between the solution quality and the computing time is very interesting.


2004


Articles in Referred Proceedings

Les réseaux d'engagements comme méthode pour modéliser le comportement dialogique des agents.
Mathieu Bergeron and Brahim Chaib-draa, In Actes des JFSMA 2004, 251--264, 2004, (pdf) (bib).
+AbstractIn this article, we present the web of commitment methodology that allows us to specify the dialogical behavior of agents from the commitments that can be contracted between them and from the links that can exist between those commitments. At first, we present the DIAGAL agent communication language which is based on social commitments and dialogue games that are defined as structures regulating the mechanism under which some commitments are discussed through the dialogue. Then, we present our social commitments model to explain how the agent who uses DIAGAL can use the dialogue games to manipulate the commitments. For that, we introduce the web of commitments concept which makes it possible to specify the causality links that exist between various commitments of a multi-agent system. Finally, we explain using an illustrative example how we could implement, through our simulator, our concepts and ideas.

From Global Selective Perception to Local Selective Perception.
Nicolas Bernier, Sébastien Paquet and Brahim Chaib-draa, In Proceedings of the 3rd International Joint Conference on Autonomous Agents & Multiagent Systems (AAMAS'2004), Nicholas R. Jennings and Carles Sierra and Liz Sonenberg and Milind Tambe, 1352--1353, 2004, (pdf) (bib).
+AbstractThis paper presents a reinforcement learning algorithm used to allocate tasks to agents in an uncertain real-time environment. In such an environment, tasks have to be analyzed and allocated very quickly for the multiagent system to be effective. To analyze those tasks, which are described by many attributes, we have used a selective perception technique to enable agents to narrow down the description of each task, enabling the reinforcement learning algorithm to work on a problem with a reasonable number of possible states.

Selective Perception Learning for Tasks Allocation.
Nicolas Bernier, Sébastien Paquet and Brahim Chaib-draa, In Proceedings of the AAMAS-04 Workshop on Learning and Evolution in Agent Based Systems, 42--47, 2004, (pdf) (bib).
+AbstractThis paper presents a learning algorithm used to allocate tasks to agents in an uncertain real-time environment. In such an environment, tasks have to be analyzed and allocated very quickly for the multiagent system to be effective. To analyze those tasks, which are described by many attributes, we have used a selective perception technique to enable agents to narrow down the description of each task by choosing the attributes that they should consider in each situation. By doing so, we have obtained a drastic reduction in the number of possible states. We have used this algorithm at two different levels for the problem of choosing the best fire to extinguish for each firefighter agent in the RoboCupRescue simulation environment. First, a center agent uses the algorithm to allocate a zone on fire to each firefighter agent. Then, those agents choose the best fire to extinguish in this zone. Our results show a good improvement in the agents' capability to extinguish fires, as the agents become better at distinguishing the world states.

A Logical Model for Commitment and Argument Network for Agent Communication.
Jamal Bentahar, Bernard Moulin, John-Jules Ch. Meyer and Brahim Chaib-draa, In Proceedings of the 3rd International Joint Conference on Autonomous Agents & Multiagent Systems (AAMAS'2004), Nicholas R. Jennings and Carles Sierra and Liz Sonenberg and Milind Tambe, 792--799, 2004, (pdf) (bib).
+AbstractIn this paper we present a semantics for our approach based on social commitments (SCs) and arguments for conversational agents. More precisely, we propose a logical model based on CTL* and on dynamic logic (DL). Called Commitment and Argument Network, our formal framework based on this approach uses three basic elements: SCs, actions that agents apply to these SCs and arguments that agents use to support their actions. The advantage of this logical model is to bring together all these elements and the relations existing between them within the same framework. Our semantics makes it possible to represent the dynamics of agent communication. It also allows us to establish the important link between SCs as a deontic concept and arguments. CTL* enables us to express the temporal characteristics of SCs and arguments. DL enables us to capture the actions that agents are committed to achieve.

Commitment and Argument Network: A New Formalism for Agent Communication.
Jamal Bentahar, Bernard Moulin and Brahim Chaib-draa, In Advances in Agent Communication, F. Dignum, 146--165, 2004, (pdf) (bib).
+AbstractThis paper proposes a formal framework which offers an external representation of conversations between conversational agents. Using this formalism allows us: (1) to represent the dynamics of conversations between agents; (2) to analyze conversations; (3) to help autonomous agents to take part in consistent conversations. The proposed formalism, called “commitment and argument network”, uses a combined approach based on commitments and arguments. Commitments are used to capture the social and the public aspect of conversations. Arguments on the other side are used to capture the reasoning aspect. We also propose a layered communication model in which the formalism and the approach take place.

A Persuasion Dialogue Game based on Commitments and Arguments.
Jamal Bentahar, Bernard Moulin and Brahim Chaib-draa, In Proceedings of the AAMAS-04 First International Workshop on Argumentation in Multi-Agent Systems, Iyad Rahwan and Pavlos Moraitis and Chris Reed, 148--164, 2004, (pdf) (bib).
+AbstractIn this paper we propose a new persuasion dialogue game for agent communication. We show how this dialogue game is modeled by a framework based on social commitments and arguments. Called Commitment and Argument Network (CAN), this framework allows us to model communication dynamics in terms of actions that agents apply to commitments and in terms of argumentation relations. This dialogue game is specified by indicating its entry conditions, its dynamics and its exit conditions. In order to solve the problem of the acceptance of arguments, the protocol integrates the concept of agents’ trustworthiness in its specification.

A Pragmatic Approach to Build Conversation Protocols using Social Commitments.
Roberto A. Flores and Robert C. Kremer, In Proceedings of the 3rd International Joint Conference on Autonomous Agents & Multiagent Systems (AAMAS'2004), Nicholas R. Jennings and Carles Sierra and Liz Sonenberg and Milind Tambe, 1242--1243, 2004, (pdf) (bib).
+AbstractWe present a model to build conversation protocols aiming at the execution of actions. Our contention is that protocols can be explained as an orderly sequence of messages for adopting and discharging action-entailing social commitments. This model explicitly indicates the messages that are allowed (sequencing) and the agent that is expected to issue the next message (turn-taking) in all conversational states, thus defining state properties upon which the construction and verification of protocols can be based.

Conversational Semantics with Social Commitments.
Roberto A. Flores, Philippe Pasquier and Brahim Chaib-draa, In Proceedings of the Workshop on Agent Communications at the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems, Rogier van Eijk and Marc-Philippe Huget and Frank Dignum, 19--33, 2004, (pdf) (bib).
+AbstractMessage semantics are traditionally defined in terms of mental states, which is a trend that is criticized for assuming the sincerity and cooperativeness of agents. To circumvent these limitations, several proposals have been put forth to define the semantics of messages using social commitments. We follow this trend and present a conversational model where the meaning of messages is based on their use as coordinating devices advancing conversations that advance the state of social commitments and the state of the activities in which agents participate.

A Principled Modular Approach to Construct Flexible Conversation Protocols.
Roberto A. Flores and Robert C. Kremer, In Advances in Artificial Intelligence: 17th Conference of the Canadian Society for Computational Studies of Intelligence, Canadian AI 2004, A.Y. Tawfik and S.D. Goodwin, 1--15, 2004, (pdf) (bib).
+AbstractBuilding conversation protocols has traditionally been an art more than a science, as their construction is often guided by designers' intuition rather than by a principled approach. In this paper we present a model for building conversation protocols using inference principles that allow the computational specification and verification of message sequencing and turn-taking. This model, which is based on the negotiation of social commitments, results in highly flexible protocols that support agent heterogeneity while abiding by software engineering practices. We exemplify the specification of protocols using the contract net protocol, a common interaction protocol from the multiagent literature.

A Decentralized Approach to Collaborative Driving Coordination.
Simon Hallé, Julien Laumonier and Brahim Chaib-draa, In Proceedings of the 7th IEEE International Conference on Intelligent Transportation Systems (ITSC'2004), 2004, (pdf) (bib).
+AbstractCollaborative driving is an important subcomponent of Intelligent Transportation Systems (ITS) as it strives to create autonomous vehicles that are able to cooperate in order to navigate through urban traffic by using communications. In this paper, we address this problem using a platoon of cars considered as a multiagent system. To do that, we propose a hierarchical architecture based on three layers (guidance, management, traffic control) which can be used to develop centralized platoons (where a head vehicle-agent coordinates other vehicle-agents by applying coordination rules) and decentralized platoons (where the platoon is considered as a team of vehicle-agents maintaining the platoon together). We propose the model of teamwork used in multiagent systems as a decentralized alternative to previous coordination centralized on the platoon’s leader and outline its benefits using collaborative driving simulation scenarios.

Collaborative Driving System Using Teamwork for Platoon Formations.
Simon Hallé and Brahim Chaib-draa, In Proceedings of the AAMAS-04 Workshop on Agents in Traffic and Transportation, 35--46, 2004, (pdf) (bib).
+AbstractCollaborative driving is a growing domain of Intelligent Transportation Systems (ITS) that makes use of communications to autonomously guide cooperative vehicles on an Automated Highway System (AHS). In this paper, we address this issue by using a platoon of cars considered as more or less autonomous software agents. To do that, we propose a hierarchical architecture based on three layers (guidance layer, management layer and traffic control layer) which can be used to develop centralized platoons (where a head vehicle-agent coordinates other vehicle-agents by applying its coordination rule) and decentralized platoons (where the platoon is considered as a team of vehicle-agents trying to maintain the platoon). The latter decentralized model will mainly consider a teamwork related model using architectures like STEAM. These different coordination models will be compared using simulation scenarios to provide arguments for and against each approach.

Resource Allocation in Time-Constrained Environments: The Case of Frigate Positioning in Anti-Air Warfare.
Jean-Francois Morissette, Brahim Chaib-draa and Pierrick Plamondon, In Proceedings of the 5th International Conference on Computer Science Modelling, Computation and Optimization in Information Systems and Management Sciences (MCO'2004), Le Thi Hoai An and Pham Dinh Tao, 463--470, 2004, (pdf) (bib).
+AbstractMaritime environments are known to be very complex environments with tight real-time constraints where it is very difficult to manage resource allocation. This is the case, for example, for a frigate which must position itself in order to use its resources as effectively as possible to increase its chances of survival when air raids come. Under such hard constraints, the commander can often make errors because of the complexity of the environment or the stress which the situation can generate. We propose here to implement a decision-aid system which suggests the position that the frigate must take. We start by giving a heuristic which evaluates the effectiveness of a position according to the threats found in the environment. Then, we propose an algorithm which treats all the possible rotations and suggests the best one for a given situation. Finally, we present the results of our experiments and comment on them.

Multi-Agent Simulation of Collaborative Strategies in a Supply Chain.
Thierry Moyaux, Brahim Chaib-draa and Sophie D'Amours, In Proceedings of the 3rd International Joint Conference on Autonomous Agents & Multiagent Systems (AAMAS'2004), Nicholas R. Jennings and Carles Sierra and Liz Sonenberg and Milind Tambe, 52--59, 2004, (pdf) (bib).
+AbstractThe bullwhip effect is the amplification of the order variability in a supply chain. This phenomenon causes important financial cost due to higher inventory levels and agility reduction. In this paper, we study, for each company in a supply chain, the individual incentive to collaborate to reduce this problem. To achieve this, we simulate a supply chain inspired by the Québec forest industry, in which each company is an agent that uses one of three ordering schemes. Each ordering scheme represents a level of collaboration. One run of the simulation is done with fifty (50) weeks for each of the 3^6 = 729 combinations of these 3 ordering schemes among the 6 companies of the simulation. In each run, we evaluate each company’s inventory holding and backorder costs. These outcomes are used to build a game in the normal form, which is next analyzed using Game Theory. In particular, we find two Nash equilibria incurring the minimum cost of the supply chain. We also note that there are no Nash equilibria in which some companies do not collaborate: collaborating companies have no incentive to stop collaboration.

An Agent Simulation Model for the Québec Forest Supply Chain.
Thierry Moyaux, Brahim Chaib-draa and Sophie D'Amours, In Proc. 8th International Workshop on Cooperative Information Agents (CIA2004), Matthias Klusch and Sascha Ossowski and Vipul Kashyap and Rainer Unland, 226--241, 2004, (pdf). Erfurt, Germany (bib).
+AbstractA supply chain is a network of companies producing and distributing products to end-consumers. The Québec Wood Supply Game (QWSG) is a board game designed to teach supply chain dynamics. The QWSG provides the agent model for every company in our simulation. The goal of this paper is to introduce this simulation model. For this purpose, we first outline the QWSG, and then describe with mathematical equations each company in our simulation. Finally, three examples illustrate the use of our simulation to study collaboration in supply chains. More precisely, we study incentives for collaboration at both the supply chain and company level.

Multi-Attribute Decision Making in a Complex Multiagent Environment using Reinforcement Learning with Selective Perception.
Nicolas Bernier, Sébastien Paquet and Brahim Chaib-draa, In Advances in Artificial Intelligence: 17th Conference of the Canadian Society for Computational Studies of Intelligence, Canadian AI 2004, A.Y. Tawfik and S.D. Goodwin, 416--421, 2004, (pdf) (bib).
+AbstractChoosing between multiple alternative tasks is a hard problem for agents evolving in an uncertain real-time multiagent environment. An example of such an environment is the RoboCupRescue simulation, where at each step an agent has to choose between a number of tasks. To do that, we have used a reinforcement learning technique where an agent learns the expected reward it should obtain if it chooses a particular task. Since all possible tasks can be described by many attributes, we have used a selective perception technique to enable agents to narrow down the description of each task.

Comparison of Different Coordination Strategies for the RoboCupRescue Simulation.
Sébastien Paquet, Nicolas Bernier and Brahim Chaib-draa, In Proceedings of The 17th International Conference on Industrial & Engineering Applications of Artificial Intelligence & Expert Systems, 987--996, 2004, (pdf) (bib).
+AbstractA fundamental difficulty faced by cooperative multiagent systems is to find how to efficiently coordinate agents. There are three fundamental processes to solve the coordination problem: mutual adjustment, direct supervision and standardization. In this paper, we present our results, obtained in the RoboCupRescue environment, comparing those coordination approaches to find which one is the best for a complex real-time problem like this one. Our results show that a decentralized approach based on mutual adjustment can be more flexible and give better results than a centralized approach using direct supervision. Also, we have obtained results showing that a standardization rule like the partitioning of the map can be helpful in this kind of environment.

A Computational Model for Conversation Policies for Agent Communication.
Jamal Bentahar, Bernard Moulin, John-Jules Ch. Meyer and Brahim Chaib-draa, In Proceedings of 5th International Workshop on Computational Logic in Multi-Agent Systems (CLIMA V), 2004, (pdf) (bib).
+AbstractIn this paper we propose a formal specification of a persuasion protocol between autonomous agents using an approach based on social commitments and arguments. In order to be flexible, this protocol is defined as a combination of a set of conversation policies. These policies are formalized as a set of dialogue games. The protocol is specified using two types of dialogue games: entry dialogue game and chaining dialogue games. The protocol terminates when exit conditions are satisfied. Using a tableau method, we prove that this protocol always terminates. The paper also addresses the implementation issues of our protocol using logical programming and an agent-oriented platform.

Multi-Platform Coordination in Command and Control.
Patrick Beaumont and Brahim Chaib-draa, In Proceedings of the 3rd International Conference on Knowledge Systems for Coalition Operations (KSCO'2004), 2004, (pdf) (bib).
+AbstractThe use of agent and multiagent techniques to assist humans in their daily routines has been increasing for many years, notably in Command and Control (C2) systems. In this article, we focus on multiagent coordination techniques for resource management in real-time C2 systems. The particular problem we studied is the design of a decision-support system for anti-air warfare on Canadian frigates. In the case of several frigates defending against incoming threats, multiagent coordination is a complex problem of capital importance. Better coordination mechanisms are important to avoid redundancy in engagements and inefficient defence caused by conflicting actions. We present different task sharing coordination mechanisms with their evaluation.


2003


Articles in Referred Journals

Modèle des dialogues entre agents : un état de l'art.
Philippe Pasquier and Brahim Chaib-draa, In In Cognito, Cahiers Romans de Sciences Cognitives, 77--135, 2003, (pdf) (bib).
+AbstractIn this paper, we present a state of the art of agent dialog models according to their theoretical foundations. After sketching the main concepts surrounding dialog models in general and refining the relationship between natural language dialogs and artificial agents dialogs, we present the two main trends in agent dialog modelling: (1) intentional approaches in which dialog should emerge from the chaining of speech acts derived from agents’ intentions through recognition of and reasoning about others’ mental states and (2) conventional and social approaches in which the semantics of communication is expressed in terms of social commitments standing for responsibilities contracted by an agent toward others through conventionally regulated dialogs. Having detailed these two families, we demonstrate the pros and cons of each with respect to theoretical as well as practical issues, while indicating perspectives for future inquiry.

Book Chapters

Car platoons simulated as a multiagent system.
Simon Hallé, Brahim Chaib-draa and Julien Laumonier, In Proceedings of the 4th Workshop on Agent-Based Simulation, SCS, 2003, (pdf) (bib).
+AbstractCollaborative driving is an important sub-component of Intelligent Transportation Systems (ITS) and it strives to create vehicles that are able to cooperate in order to navigate through urban traffic by using communications. In this paper, we address this issue by putting emphasis on the simulation of a platoon of cars considered as more or less autonomous software agents. To do that, we propose a hierarchical architecture based on three layers (guidance layer, management layer and traffic control layer) which can be used for simulating a centralized platoon (where a head vehicle-agent coordinates other vehicle-agents by applying its coordination rule) or a decentralized platoon (where the platoon is considered as a team of vehicle-agents trying to maintain the platoon). This hierarchical architecture is supported by a simulator that we describe in detail. Finally, we present our first results concerning the first step of our project, which focuses only on the first level (autonomous longitudinal control), where only the relative distance and speed of the cars are actively controlled.

Agent-Based Simulation of the Amplification of Demand Variability in a Supply Chain.
Thierry Moyaux, Brahim Chaib-draa and Sophie D'Amours, In Proceedings of the 4th Workshop on Agent-Based Simulation, SCS, 2003, (pdf) (bib).
+AbstractA supply chain is a set of companies producing or carrying products to customers. In such a supply chain, the bullwhip effect is the amplification of demand variability, that is, a distortion in information when this information travels from one end of a supply chain to the other. Inefficiencies which are due to this effect are excessive inventory, poor customer service, ineffective transportation, missed production schedules... A game called the Beer Game is a widely used classroom exercise for demonstrating the dynamics in a supply chain. We focus on an adaptation of this game to the forest industry: the Québec Wood Supply Game. We have simulated this game in a spreadsheet program: this first implementation is the base of the multi-agent simulation presented in this paper where intelligent agents represent companies. These agents will simulate how companies order, produce and store products. In this paper, we describe the supply chain model in the Québec Wood Supply Game and how we will make it more realistic with agents. However, we present neither our spreadsheet simulation of this game nor our solution but the bullwhip effect.

Edited Books

Modèles formels de l'interaction.
A. Herzig and B. Chaib-draa, Cépadues, 2003 (bib).

Articles in Referred Proceedings

Coalition Formation with Non-Transferable Payoff for Group Buying.
Frederick Asselin and Brahim Chaib-draa, In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS'03), Jeffrey S. Rosenschein and Tuomas Sandholm and Michael Wooldridge and Makoto Yokoo, 922--923, 2003, (pdf). July 14--18, Melbourne, Australia (bib).

Towards a Formal Framework for Conversational Agents.
Jamal Bentahar, Bernard Moulin and Brahim Chaib-draa, In Proceedings of the Agent Communication Languages and Conversation Policies AAMAS 2003 Workshop, Marc-Philippe Huget and Frank Dignum, 2003, (pdf). July 14th 2003, Melbourne, Australia (bib).
+AbstractThis paper proposes a formal framework which offers an external representation of conversations between conversational agents. Using this formalism allows us: (1) to represent the dynamics of conversations between agents; (2) to analyze conversations; (3) to help autonomous agents to take part in consistent conversations. The proposed formalism, called “commitment and argument network”, uses a combined approach based on commitments and arguments. Commitments are used to capture the social and the public aspect of conversations. Arguments on the other side are used to capture the reasoning aspect. We also propose a layered communication model in which the formalism and the approach take place.

Vers une approche pour la modélisation du dialogue basée sur les engagements et les arguments.
Jamal Bentahar, Bernard Moulin and Brahim Chaib-draa, In Actes des Secondes Journées Francophones Modèles Formels de l'Interaction, Andreas Herzig and Brahim Chaib-draa and Philippe Mathieu, 19--28, 2003, (pdf). 20--22 mai 2003, Lille, France (bib).
+AbstractIn this article we propose an approach to modelling dialogue between agents based on commitments and arguments. Dialogue is viewed as a network of commitments and arguments that agents must manipulate correctly to ensure the coherence of their conversations. This manipulation must respect a certain dynamics of these commitments, a dynamics reflected by the notions of state and of evolution over time. Indeed, combining commitments and arguments in a single approach makes it possible to capture both the social aspect and the reasoning aspect of communications.

DIAGAL: A Tool for Analyzing and Modelling Commitment-Based Dialogues between Agents.
Marc-André Labrie, Brahim Chaib-draa and Nicolas Maudet, In Advances in Artificial Intelligence, Yang Xiang and Brahim Chaib-draa, 353--369, 2003, (pdf). Proceedings of the 16th Conference of the Canadian Society for Computational Studies of Intelligence (AI 2003) (bib).
+AbstractThis paper gives an overview of DIAGAL, our agent communication language simulator currently in progress, by describing its use in analyzing and modelling automated conversations in offices. Offices are modelled here as systems of communicative action based on dialogue games. Through such games, people in offices engage in actions by making promises, stating facts, asking for information, and so on. Through these actions they create, modify, discharge, cancel, release, assign and delegate commitments that bind their current and future behaviors. To make such commitments apparent, we consider here Agent Communication Language (ACL) from the dialectic point of view, where agents “play a game” based on commitments. Such commitment-based games are incorporated in the DIAGAL tool, which has been developed with the following questions in mind: (1) What kind of structure does the game have? How are rules specified within the game? (2) What kinds of game compositions are allowed? (3) How do participants in a conversation reach agreement on the current game? How are games opened or closed?

Multi-Agent Coordination Based on Tokens: Reduction of the Bullwhip Effect in a Forest Supply Chain.
Thierry Moyaux, Brahim Chaib-draa and Sophie D'Amours, In Proceedings of the 2nd International Joint Conference on Autonomous Agents and MultiAgent Systems (AAMAS'03), Jeffrey S. Rosenschein and Tuomas Sandholm and Michael Wooldridge and Makoto Yokoo, 670--677, 2003, (pdf). 14--18 July 2003, Melbourne, Australia (bib).
+AbstractIn this paper, we focus on the supply chain as a multi-agent system and we propose a new coordination technique to reduce the fluctuations of orders placed by each company to its suppliers in such a supply chain. This problem of amplification of the demand variability is called the bullwhip effect. To reduce such a bullwhip effect, we propose a technique based on tokens to achieve a decentralized coordination. Precisely, classical orders manage the demand itself whereas tokens manage effects on company inventory due to variations of this demand. Finally, the proposed approach is validated by the Wood Supply Game, which is a supply chain model used to make players aware of the bullwhip effect. We experimentally verify that our coordination technique leads to less variable orders (i.e. the standard deviation of orders is reduced) while inventory levels are not excessively high but sufficient to avoid backorders.

Satisfaction distribuée de contraintes et son application à la génération d'un emploi du temps d'employés.
Thierry Moyaux, Brahim Chaib-draa and Sophie D'Amours, In Proceedings of the 5th Congrès International de Génie Industriel (CIGI), 2003, (pdf) (bib).
+AbstractThis article addresses employee timetable generation (ETP, for Employee Timetabling Problem) from a multiagent point of view, in order to provide a framework for studying distributed coordination supported by communication. We focus in particular on schools, seeking to assign teachers and rooms to courses while respecting time slots. This ETP is formalized as a distributed constraint satisfaction problem (DCSP, for Distributed Constraints Satisfaction Problem). A backtracking algorithm that solves this DCSP is then proposed. Such an algorithm generates both the desired timetable and the coordination of the agents that will follow it. Finally, we explain how this algorithm was implemented and validated.

Coordination à base de jetons pour réduire l'amplification de la variabilité de la demande dans une chaine logistique.
Thierry Moyaux, Brahim Chaib-draa and Sophie D'Amours, In Proceedings of the 5e Congrès International de Génie Industriel (CIGI), 2003, (pdf) (bib).
+AbstractIn this paper, we propose a coordination mechanism that reduces the bullwhip effect (i.e. the increase in demand variability) in a supply chain. This mechanism relies on tokens that allow a company to separate its orders into two components: the "order" part represents what the company needs to meet market demand, whereas the "tokens" part represents what it needs to stabilize its inventories. To validate our approach, we use a supply chain model adapted from the Beer Game in order to take into account the specific features of the forest industry of Québec (a Canadian province). This model is implemented in a spreadsheet so that our approach can be compared experimentally with other ordering methods.

Learning Coordination in RoboCupRescue.
Sébastien Paquet, In Advances in Artificial Intelligence, Yang Xiang and Brahim Chaib-draa, 627--628, 2003, (pdf). Proceedings of the 16th Conference of the Canadian Society for Computational Studies of Intelligence (AI 2003) (bib).
+AbstractIn this abstract, we present a complex multiagent environment, the RoboCupRescue simulation, and show some of the learning opportunities for the coordination of agents in this environment.

The Cognitive Coherence Approach for Agent Communication Pragmatic.
Philippe Pasquier and Brahim Chaib-draa, In Proceedings of the 2nd International Joint Conference on Autonomous Agents and MultiAgent Systems (AAMAS'03), Jeffrey S. Rosenschein and Tuomas Sandholm and Michael Wooldridge and Makoto Yokoo, 544--551, 2003, (pdf). 14--18 July 2003, Melbourne, Australia (bib).
+AbstractDifferent approaches have investigated the syntax and semantics of agent communication. However, none of these approaches (including agent communication languages (ACLs), conversation policies and dialogue games) has indicated how agents should dynamically use communications. In fact, most of these approaches have mainly focused on the "structure" of dialogues, although developers are more interested in agents' capability of having "useful" conversations with respect to their goals than in their ability to structure dialogues. This leads us to propose a theory of the use of conversations between agents. This pragmatic theory extends and adapts the cognitive dissonance theory (a major theory of social psychology) to multi-agent systems. In this paper, we show how this theory allows us to provide generic conceptual tools for the automation of both agent communicational behavior and attitude change processes. The cognitive coherence that we propose is formulated in terms of constraints and elements of cognition and allows us to define cognitive incoherences and dialogue utility measures. We show how these measures could be used to solve common problems and answer some critical questions concerning the use of agent communication frameworks. Finally, the theory is illustrated with an example of the automatic use of dialogue games.

An Exploration in Using Cognitive Coherence Theory to Automate BDI Agents' Communicational Behavior.
Philippe Pasquier, Nicolas Andrillon and Brahim Chaib-draa, In Proceedings of the Agent Communication Languages and Conversation Policies AAMAS 2003 Workshop, Marc-Philippe Huget and Frank Dignum, 2003, (pdf). July 14th 2003, Melbourne, Australia (bib).
+AbstractThe cognitive coherence theory for agent communication pragmatics allows modelling a great number of agent communication aspects while being computational. This paper describes our exploration in applying the cognitive coherence pragmatic theory to BDI agent communication. The presented practical framework relies on our dialogue-game-based agent communication language (DIAGAL) and our dialogue game simulator toolbox (DGS). It provides the necessary theoretical and practical elements for implementing the theory as a new layer over classical BDI agents. In doing so, it yields a general scheme for automating agents' communication behavior. Finally, we give an example of the resulting system executing.

Engagements, intentions et jeux de dialogue.
Philippe Pasquier and Brahim Chaib-draa, In Actes des Secondes Journées Francophones Modèles Formels de l'Interaction, Andreas Herzig and Brahim Chaib-draa and Philippe Mathieu, 289--294, 2003, (pdf). 20--22 mai 2003, Lille, France (bib).
+AbstractThis article aims to present the agent reasoning paradigm which is usually implicit behind the new social approaches to agent communication. In order to propose a pragmatics of agent communication with those approaches, we provide a link between public/social aspects and private cognitions (resulting in intentions). Finally, we indicate how those links could be used to automate the use of DIAGAL's dialogue games.

A Frigate Movement Survival Agent-Based Approach.
Pierrick Plamondon, Brahim Chaib-draa, Patrick Beaumont and Dale Blodgett, In Proceedings of the 7th International Conference on Knowledge-Based Intelligent Information & Engineering Systems (KES'2003), V. Palade and R.J. Howlett and L.C. Jain, 683--691, 2003, (pdf). 3--5 September 2003, University of Oxford, United Kingdom (bib).
+AbstractThe position of a frigate facing threats can increase its chances of survival; it is therefore important to investigate this aspect in order to determine how a frigate can position itself during an attack. To achieve that, we propose a first method based on the Bayesian movement, performed by a learning agent, which determines the optimal positioning of the frigate by dividing the defense area into six sectors for weapon engagement and then makes efficient use of all the weapons available by using the sectors. The second method that we propose, called Radar Cross-Section Reduction (RCSR) movement, aims at reducing the exposed surface of the frigate to incoming threats before their locking phase is over. Preliminary results on these two methods are presented and discussed. Finally, an implementation of a meta-level agent which would make efficient use of both complementary methods is suggested.

Request for Action Reconsidered as a Dialogue Game based on Commitments.
Brahim Chaib-draa, Marc-André Labrie and Nicolas Maudet, In Communication in Multiagent Systems, Marc-Philippe Huget, 284--299, 2003, (pdf). Appeared also in the preproceedings of the AAMAS 2002 Workshop on Agent Communication Languages and Conversation Policies (bib).
+AbstractThis paper follows recent work in the field of dialectical models of inter-agent communication. The request action as proposed by Winograd & Flores is reconsidered in an original dialogue game framework, as a composition of different basic games. These basic games are based on commitments of participants and are handled through a "contextualization game" which aims at defining how games are handled (opened, closed, etc.) through the conversation. We show how such a model for conversation offers more flexibility, considers unexpected messages, and uses various small conversation policies. Finally, we give an overview of the game simulator that we are currently developing.

Proceedings

Advances in Artificial Intelligence.
Y. Xiang and B. Chaib-draa, 16th Conf. of the Canad. Society for Comput. Studies of Intelligence (AI-2003), 2003 (bib).


2002


Articles in Referred Journals

Multi-items Auctions for Automatic Negotiation.
Houssein Benameur, Brahim Chaib-draa and Peter Kropf, In Journal of Information and Software Technology, 291--301, 2002, (pdf) (bib).
+AbstractAvailable resources can often be limited with regard to the number of demands. In this paper we propose an approach for solving this problem which consists of using the mechanisms of multi-item auctions for allocating the resources to a set of software agents. We consider the resource problem as a market in which there are vendor agents and buyer agents which trade on items representing the resources. These agents use multi-item auctions which are viewed here as a process of automatic negotiation, and implemented as a network of intelligent software agents. In this negotiation, agents exhibit different acquisition capabilities which let them act differently depending on the current context or situation of the market. For example, the "richer" an agent is, the more items it can buy, i.e. the more resources it can acquire. We present a model for this approach based on the English auction, then we discuss experimental evidence of such a model.

Trends in Agent Communication Language.
Brahim Chaib-draa and Frank Dignum, In Computational Intelligence, 89--101, 2002, (pdf) (bib).
+AbstractThis article aims to present the agent reasoning paradigm which is usually implicit behind the new social approaches to agent communication. In order to propose a pragmatics of agent communication with those approaches, we provide a link between public/social aspects and private cognitions (resulting in intentions). Finally, we indicate how those links could be used to automate the use of DIAGAL's dialogue games.

Causal Maps: Theory, Implementation and Practical Applications in Multiagent Environments.
Brahim Chaib-draa, In IEEE Transactions on Knowledge and Data Engineering, 1--17, 2002, (pdf) (bib).
+AbstractAnalytical techniques are generally inadequate for dealing with causal interrelationships among a set of individual and social concepts. Usually, causal maps are used to cope with this type of interrelationship. However, the classical view of causal maps is an intuitive one, with ad hoc rules, imprecise semantics for the primitive concepts, and no sound formal treatment of relations between concepts. In this paper, we solve this problem by proposing a formal model for causal maps with a precise semantics based on relation algebra, together with the software tool, CM-RELVIEW, in which it has been implemented. Then, we investigate the issue of using this tool in multiagent environments by explaining through different examples how and why this tool is useful for the following aspects: 1) reasoning on agents' subjective views, 2) qualitative distributed decision making, and 3) the organization of agents considered as a holistic approach. For each of these aspects, we focus on the computational mechanism developed within CM-RELVIEW to support it.

L'interaction comme champ de recherche.
Brahim Chaib-draa and Robert Demolombe, In Information-Interaction-Intelligence, 2002, (pdf) (bib).
+AbstractThis preface surveys the problem of interaction, in particular through the following questions: (1) what is meant by interaction? (2) what are the primitive concepts of interaction? (3) which formalisms are appropriate for interaction? With this overview, we hope to make this special issue easier to understand. The issue is devoted to formal models of interaction (Modèles formels de l'interaction) and gathers a series of articles selected from the "premières journées d'études sur les modèles formels de l'interaction" held in Toulouse from 21 to 23 May 2001.

Commitment-based and Dialogue-game based Protocols--New Trends in Agent Communication Language.
Nicolas Maudet and Brahim Chaib-draa, In Knowledge Engineering, 157--179, 2002, (pdf) (bib).
+AbstractThis survey introduces existing approaches to agent communication languages (ACLs) and, in particular, conversation policies (CPs), which can be viewed as general constraints on the sequence of semantically coherent messages leading to a goal. The limitations of these CPs are then discussed in detail, particularly limitations on flexibility and specification. Finally, ACLs are viewed from the dialectic point of view and some approaches are introduced in this context: some focusing on commitment-based protocols and others on dialogue-based protocols.

Book Chapters

Aspects formels des systèmes multiagents.
B. Chaib-draa and L. Gaguet, In Organisation et applications des SMAS, Hermes Lavoisier, 2002 (bib).

Articles in Referred Proceedings

A Method to Optimize Ship Maneuvers for the Coordination of Hardkill and Softkill Weapons within a Frigate.
Dale Blodgett, Pierrick Plamondon, Brahim Chaib-draa, Peter Kropf and Eloi Bosse, In 7th International Command and Control Research and Technology Symposium (7th ICCRTS), 2002, (pdf) (bib).
+AbstractThe coordination of anti-air warfare hardkill and softkill weapon systems is an important aspect of command and control for a Frigate. Since the effectiveness of a particular weapon varies depending on the orientation of the Frigate with respect to the threats faced, a key element of the coordination process is to maneuver the Frigate to most effectively use all the weapons available. This paper shows that the environment surrounding the Frigate can be divided into six fundamental sectors for weapon engagement. The method to determine the general effectiveness of each sector for the threats faced is shown. A naïve Bayes method that determines the optimal positioning of the Frigate to most effectively use the hardkill and softkill weapons is presented. Also discussed are the different types of planners that were investigated for planning engagements for the hardkill and softkill weapon systems. Preliminary results comparing and rating these planners are shown, both with and without the recommended maneuvers.

Toward a Protocol for the Formation of Coalitions of Buyers.
Frederick Asselin and Brahim Chaib-draa, In Proceedings of the Fifth International Conference on Electronic Commerce Research (ICECR-5), Teodor Gabriel Crainic and Bezalel Gavish, 2002, (pdf) (bib).
+AbstractIt is advantageous to group together for the purchase of a good or a service in order to benefit from a price reduction according to the size of the purchasing group. The bought product must however be the same for all the members of the group and this requires such a group to make compromises on its exact specification. It is therefore necessary, for a given set of consumers, to partition itself into groups (or coalitions) which might satisfy preferences of their members. To this end, we have developed a protocol which finds a "satisfying" partition of consumers. As the search for this partition is equivalent to the NP-hard problem of generating exact set covers, we have tested our protocol by varying the number of agents and the number of possible specifications with random preferences to see under which conditions we can find a solution in a reasonable amount of time and memory space.

Towards an Agent-Based Approach for Multimarket Package e-Procurement.
Houssein Ben-Ameur, Stéphane Vaucher, Robert Gérin-Lajoie, Peter Kropf and Brahim Chaib-draa, In Proceedings of the Fifth International Conference on Electronic Commerce Research (ICECR-5), Teodor Gabriel Crainic and Bezalel Gavish, 2002, (pdf) (bib).
+AbstractWhile most e-commerce research focuses on single-market problems, less work has been done on multimarket aggregation. Nowadays it is important to address the multimarket package e-procurement problem if we want to acquire a combination of goods and services from different suppliers and service providers. To achieve this, one should address the issues pertaining to identifying a company's needs, discovering potential partners and suppliers, gathering distributed information and conducting combined negotiations, creating a seamless information flow with different heterogeneous markets, suppliers, and partners, and finally concluding transactions. Several commercial e-procurement applications already automate some aspects of the procurement processes, helping decision makers and employees complete their purchasing activity. But none takes into account the key aspects of combining goods and services into one aggregated package. Agent-based systems are well equipped to address the challenges of multimarket package e-procurement. Indeed, goal-driven autonomous agents aim to satisfy user requirements and preferences while being flexible enough to deal with the diversity of semantics amongst markets, suppliers, service providers, partners and individual sellers. A distributed common shared space, called infospace, comprising the negotiation exchanges and states, allows for agent coordination, market aggregation, and package construction. This paper presents some issues and challenges faced in multimarket package e-procurement, and puts forward an agent-based approach to deal with them.

Cohérence et conversations entre agents : vers un modèle basé sur la consonance cognitive.
Philippe Pasquier and Brahim Chaib-draa, In Systèmes multi-agents et systèmes complexes : ingénierie, résolution de problèmes et simulation, Philippe Mathieu and Jean-Pierre Müller, 189--203, 2002, (pdf). Actes des JFIADSMA'02 (bib).

Request for Action Reconsidered as Dialogue Game based on Commitments.
Nicolas Maudet, Brahim Chaib-draa and Marc-André Labrie, In Workshop on Agent Communication Languages and Conversation Policies (at AAMAS'02), Huget, M.-P., 284--299, 2002 (bib).


2001


Articles in Referred Journals

NetSA: une architecture multiagent réutilisable pour les environnements riches en informations.
M. Coté, B. Chaib-draa and N. Troudi, In Information, Interaction, Intelligence, 39--78, 2001, (pdf) (bib).
+AbstractToday, the web requires software that can operate on heterogeneous sources of information which are generally located in an open and dynamic environment (e.g., the internet). It also requires us to reconsider some complex applications (for instance, digital libraries, bank services, insurance services, etc.) as a set of autonomous agents having the ability to achieve some objectives. In this paper, we propose a reusable multiagent architecture integrating new technologies (for instance, agent technology, KQML, mediation between agents, web technology, etc.) that help to support such requirements. This architecture, called NetSA, has three levels: the first is devoted to communication with users, the second deals with information and the third is in charge of extraction and queries of information. Thus, NetSA addresses issues such as (i) interaction between agents and between agents and users, (ii) reasoning for the mediation between agents and, finally, (iii) information through its interaction with legacy systems. We have specified, developed and validated this architecture. To validate it, we have opted for a competition between three banks so that they offer the best mortgage to users. To achieve that, we have studied and developed algorithms from auction theory that can optimize sellers or buyers according to some conditions. Our first results have shown that NetSA (a) can be accessed by a maximum of 15 users (for more users, it might be useful to use several NetSA architectures); (b) offers more efficient information seeking than classical tools; (c) is easy to use; (d) is very useful if one wants to access legacy systems.

Book Chapters

Agent et Systèmes Multiagents.
B. Chaib-draa, B. Moulin and I. Jarras, In Principes et architecture des systèmes multi-agent, Hermes Lavoisier, 2001 (bib).

Causal Reasoning in Multiagent Environments.
B. Chaib-draa, In Encyclopedia of Microcomputers, 2001 (bib).

CM-RELVIEW: A Tool for Causal Reasoning in Multiagent Environments.
B. Chaib-draa, In Encyclopedia of Computer Science and Technology, 2001 (bib).

Articles in Referred Proceedings

Vers un outil d'analyse et de synthèse pour les systèmes multiagents basé sur l'algèbre relationnelle.
B. Chaib-draa, In Proc. of Modèles Formels pour l'Interaction, 2001, (pdf) (bib).
+AbstractIn the context of multiagent systems (MASs), we currently use various formal tools for analysis and synthesis. Most of these formal tools are based on logic, decision making and market mechanisms. In our opinion, these tools must be complemented by relational calculus if we want to represent and reason about social structures in which relations between agents are unavoidable. The main contribution of this paper is the introduction of a formal approach based on relational calculus for reasoning about both crisp ("entières") and fuzzy ("floues") relations. The main elements covered by this article are: (1) the basic notions of relational calculus; (2) the use of crisp relations in MASs and, (4) the use of fuzzy relations in MASs.

Révision des croyances dans un environnement multiagent : une approche basée sur la crédibilité et les arguments.
I. Jarras and B. Chaib-draa, In Proc. of Modèles Formels pour l'Interaction, 2001, (pdf) (bib).
+AbstractLittle research has addressed the problem of belief revision in a multiagent setting. In any case, to our knowledge, none has addressed belief revision that takes the credibility of informants into account while keeping track of the arguments in favour of the revision once it has been carried out. This is the problem that motivated us and for which we propose here a formal approach based on labelled logic.

CM-RELVIEW: A Tool for Causal Reasoning in Multiagent Environments.
B. Chaib-draa, In Proc. of Intelligent Agent Technology-IAT'01, 2001, (pdf) (bib).
+AbstractAnalytical techniques are generally inadequate for dealing with causal interrelationships among a set of individual and social concepts. In this paper, we present a software tool called CM-RELVIEW based on relational algebra for dealing with such causal interrelationships. Then we investigate the issue of using this tool in multiagent environments, particularly in the case of: (1) qualitative distributed decision making and (2) the organization of agents considered as a holistic approach. For each of these aspects, we focus on the computational mechanisms developed within CM-RELVIEW to support it.

Automated Negotiation based on Multi-Items Auctions.
Ben-Ameur Houssein, B. Chaib-draa and P. Kropf, In Proc. of Agent Oriented Information Systems-AOIS'01, 2001, (pdf) (bib).
+AbstractAvailable resources can often be limited with regard to the number of demands. In this paper we propose an approach for solving this problem using the mechanisms of multi-item auctions for allocating the resources to a set of software agents. We consider the resource allocation problem as a market with vendor and buyer agents participating in a multi-item auction. The agents exhibit different acquisition capabilities which let them act differently depending on the current context or situation of the market. We present a model for this approach based on the English auction, and discuss experimental evidence of such a model.

A teamwork test-bed for a decision support system.
Brahim Chaib-draa, Peter Kropf and Sébastien Paquet, In Proceedings of EUROSIM 2001: Shaping Future with Simulation, 2001, (pdf) (bib).
+AbstractResource management in complex socio-technical systems (such as management and control (road, rail, sea, air), industrial engineering systems, transportation logistics, etc.) is a central and crucial process. The many diverse components involved together with various constraints such as real-time conditions make it impossible to devise exact optimal solutions. In this article, we present an approach to the resource management problem based on the multi-agent paradigm to be applied in the context of a shipboard command and control (C2) system. A general architecture for multi-agent planning and scheduling for achieving a common shared goal together with a real-time simulation environment as well as a simulation test-bed using the agent teamwork approach is described.

Coordinating Plans for Agents Performing AAW Hardkill and Softkill for Frigates.
Dale Blodgett, Sébastien Paquet, Pierrick Plamondon and Peter Kropf, In Proceedings of The 2001 AAAI Fall Symposium Series, 2001, (pdf) (bib).
+AbstractThe coordination of anti-air warfare (AAW) hardkill (HK) and softkill (SK) weapon systems is an important aspect of command and control for the HALIFAX Class Frigate. This led to the development of a rapid prototyping environment, described here, which supports the investigation of methods to coordinate the plans produced by AAW HK and SK agents. The HK and SK planning agents are described. An overview of agent coordination methods is provided, with a focus on our initial approach to HK and SK coordination via a Central Coordinator. This approach was successfully implemented, and proved effective in mitigating interference between HK and SK actions, and improved the overall survivability of the Frigate. Finally, future directions of this research are presented.

Hyper-Game Analysis in Multi-Agent Systems.
B. Chaib-draa, In AAAI Spring Symp. on Game and Decision Making in Multiagent Systems, 2001 (bib).


2000


Articles in Referred Proceedings

Resource Management in Socio-Technical Systems: A Multi-Agent Coordination Framework.
P. Kropf, B. Chaib-draa and B.A. Chalmers, In HMS 2000, 65--70, 2000, (pdf) (bib).
+AbstractResource management in complex socio-technical systems is a central and crucial task. The many diverse components involved together with various constraints such as real-time conditions make it impossible to devise exact optimal solutions. In this article, we present an approach to the resource management problem based on the multiagent paradigm to be applied in the context of a shipboard command and control (C2) system. A general architecture for multiagent planning and scheduling for achieving a common shared goal together with a real-time simulation environment as well as a simulation test-bed using the agent teamwork approach is described.


1999


Book Chapters

Analyse et modélisation des discours: des conversations humaines aux interactions entre agents logiciels.
B. Moulin, S. Delisle and B. Chaib-draa, In Analyse et simulation de conversations, L'interdisciplinaire informatique, 1999 (bib).

Edited Books

Analyse et simulation de conversations.
B. Moulin, S. Delisle and B. Chaib-draa, L'interdisciplinaire informatique, 1999 (bib).

Articles in Referred Proceedings

A Simulation Approach based on Negotiation and Cooperation between Agents: A case Study.
K. Fisher, B. Chaib-draa, et al., In IEEE Trans. on Systems, Man, and Cybernetics, 531--545, 1999, (pdf) (bib).
Abstract: This paper begins by presenting AGENDA, a simulation tool developed for the simulation and design of applications involving interacting entities. This testbed consists of two different levels, the architecture level and the system-development level. The architecture level describes a methodology for designing software agents by providing several important functionalities an agent should have. The system-development level, in turn, provides the basic knowledge-representation formalism, general inference mechanisms, and a simulation tool-box supporting visualization and monitoring of agents. Following this, the applicability of AGENDA to the transportation domain is presented in detail. The main challenge of AGENDA in the context of this domain has been to provide different cooperation-scalable methods based on negotiation, leading to different scheduling mechanisms, and to experimentally evaluate these mechanisms. This evaluation shows that: 1) AGENDA is suitable for the realistic application of the transportation domain; 2) mechanisms used for the vertical negotiation (between trucks considered as agents) and for the horizontal negotiation (between companies considered as agents) are applicable for the real-world application of the transportation domain. Finally, a complete study of the scalability of the simulation tool and the algorithms used for the negotiation is presented. This study, with the evaluation of the different mechanisms, can help designers of transportation companies, particularly in the case of large companies.

Conversations are Social Activities.
B. Chaib-draa, In Workshop on Agent Communication Language, ACL'99, 1999, (pdf) (bib).
Abstract: This paper proposes to see agent communication language (ACL) as a joint activity and not as the sum of the speaker's and hearer's (speech) acts. A conversation in the context of ACL is viewed as a joint activity which can be realized as sequences of smaller actions, many of which are themselves joint actions. Social agents which participate in this joint activity have to coordinate their joint actions. In each joint act, the participants face a coordination problem: which actions are expected? The answer proposed here is based on complex notions such as collective intention, joint plan, joint commitments and common ground.

CM-RELVIEW: A Tool for Causal Reasoning in Multiagent Environments.
B. Chaib-draa, In PRIMA Conf. on Multiagent Systems, 1999 (bib).


1998


Articles in Referred Journals

A Relational Model of Cognitive Maps.
B. Chaib-draa and J. Desharnais, In International Journal of Human-Computer Studies, 181--200, 1998, (pdf) (bib).

Aspects Statiques et Dynamiques des Croyances.
S. Djeffal and B. Chaib-draa, In Revue d'Intelligence Artificielle, 103--123, 1998, (pdf) (bib).

NetSA, Une Architecture Multiagent pour la Recherche sur Internet.
Marc Coté and Nader Troudi, In Expertise Informatique, 1998, (pdf) (bib).
Abstract: As long as the Internet continues to evolve, we will continue to be overwhelmed by data that nonetheless remains unstructured. Information retrieval in this setting becomes a difficult task, and traditional methods for searching the Internet or databases are proving increasingly limited. Cooperative information systems based on software agents offer promising solutions to this thorny problem. In this article, we describe NetSA: a multiagent system architecture for information retrieval from heterogeneous and distributed sources.

Articles in Referred Proceedings

Agent Communication Language: A Semantics based on the Success, Satisfaction and Recursion.
B. Chaib-draa and D. Vanderveken, In Proc. Agent Theories, Archit. and Lang., 1998, (pdf) (bib).
Abstract: Searle and Vanderveken's model of speech acts is undoubtedly an adequate model for the design of communicating agents because it offers a rich theory which can give important properties of protocols that we can formalize properly. We examine this theory by focusing on the two fundamental notions, success and satisfaction, which represent a systematic, unified account of both the truth and the success conditional aspects. Then, we propose an adequate formalism, the situation calculus, for representing these two notions (in a recursive way) in the context of agent communication language. The resulting framework is finally used for (1) the analysis and interpretation of speech acts; (2) the semantics and descriptions of agent communication languages.

Vers des agents logiciels considérés comme de systèmes logiques : une approche basée sur la logique LDS.
I. Jarras and B. Chaib-draa, In Actes Journées Jeunes Chercheurs en Intelligence Artificielle, 1998 (bib).

Modélisation du Raisonnement Multiagent : une Approche Basée sur les Labels.
I. Jarras and B. Chaib-draa, In Actes des 6èmes Journées Francophones en IA distribuée et Systèmes Multiagents, JFIAD-98, 1998, (pdf) (bib).
Abstract: Given the ever-growing interest in multiagent systems over the past decade, developing formal tools for the analysis, description, and implementation of these systems is now more necessary than ever. Most formal methods developed to date are based on possible-worlds semantics. Although elegant, this semantics is hampered by two major problems: 1) the problem of omniscience and 2) the problem of mechanization. In our approach, an agent is defined as an LDS system equipped with a set of mechanisms such as action, abduction, and updating. In this article, we present a modeling of agents as logical systems based on Gabbay's LDS (labelled deductive systems). The resulting model is then applied to the well-known problem of the n wise men (reasoning about others).

Une approche basée sur l'arrière fond conversationnel et l'intention collective pour les conversations entre agents logiciels.
L. Vongkasem and B. Chaib-draa, In Actes des 6èmes Journées Francophones en Intelligence Artificielle Distribuée & Systèmes Multi-Agents, 1998, (pdf) (bib).
Abstract: In our research on automated conversation for certain specific situations, we took an interest in the work of Trognon & Brassac (denoted T&B) and of Searle. T&B [Trognon & Brassac 97] propose a model of conversational sequencing, from which we extracted, according to our understanding, two rules of conversational sequencing: the rule of seeking a positive response and the rule of conversational flexibility. Searle [Searle 92], for his part, starts from the study of conversation as a whole and tries to recover, where possible, the speech acts. According to him, there is no (intrinsic) regularity in a conversation of a general type, and consequently no theoretical (explanatory) model can exist. He has nevertheless highlighted two concepts that seem promising: collective intention (we-intention) and the background. We plan to begin our research on automated conversations by exploring these two concepts in depth and integrating other cognitive mechanisms.

A Relational Modelling of Cognitive Maps.
B. Chaib-draa, In Advances in Artificial Intelligence, 12th Biennial Conf. on AI, Mercer R. E. and E. Neufeld, 1998 (bib).
Abstract: A useful tool for causal reasoning is the language of cognitive maps developed by political scientists to analyse, predict and understand decisions. Although this language is based on simple inference rules and its semantics is ad hoc, it has many attractive aspects and has been found useful in many applications: administrative sciences, game theory, information analysis, popular political developments, electrical circuits analysis, cooperative man-machines, distributed group-decision support, adaptation and learning, etc. In this paper, we show how cognitive maps can be viewed in the context of relation algebra, and how this algebra provides a semantic foundation that helps to develop a computational tool using the language of cognitive maps.


1997


Articles in Referred Journals

Coordination in CE Systems: An Approach Based on the Management of Dependencies Between Agents.
S. Lizotte and B. Chaib-draa, In CERA: Concurrent Engineering: Research and Applications, 367--377, 1997, (pdf) (bib).
Abstract: Coordination is a crucial problem in CE systems and it is neither easy to obtain nor to maintain. Our work is an attempt to develop a general model for coordination which can be adapted for some situations in the context of CE. For this purpose, the coordination definition developed by Malone [25] has been adopted. Coordination is then defined as the process of managing dependencies between activities. In this context, a theoretical model is presented that allows one to determine how to model an agent's activities and how to detect dependencies between those activities. In our model, major concepts are developed in terms of components of coordination, situations of coordination, coordination mechanisms and the coordination process. In this paper, we detail this model, then present an illustrative example, and finally identify the current status and the future evolution of our approach.

Articles in Referred Proceedings

Stratégies de négociation entre agents dans le domaine du transport.
M. Sassi and B. Chaib-draa, In Actes des 5èmes Jour. Franc. en Intelligence Artificielle Distribuée-Systèmes Multi-Agents, J. Quinquetin and M-C Thomas and B. Trousse, 279--294, 1997, (pdf) (bib).
Abstract: It is generally established that negotiation and planning in the transportation domain is a complex problem. To contribute to this problem, we propose in this paper a multi-agent approach. In this approach, trucks considered as rational and autonomous "intelligent agents" negotiate their tasks by selling and buying them. This sort of simulated trading allows them to reach a compromise which minimises their global plan. We then associate with the simulated trading some heuristics (IDA*, Tabu and Simulated Annealing) to improve it. Results of negotiation between trucks using this "new" simulated trading are discussed in detail.

Causal Reasoning in Multiagent Systems.
B. Chaib-draa, In Multi-Agent Rationality, MAAMAW'97, M. Boman and W. Van de Velde, 1997 (bib).

Database Meet Distributed AI.
G. Babin, Z. Maamar and B. Chaib-draa, In First International Workshop on Cooperative Information Agents CIA'97, M. Klush, 1997 (bib).

Connection Between Micro and Macro Aspects of Agent Modeling.
B. Chaib-draa, In Proceedings of Autonomous Agents AA'97, 1997 (bib).


1996


Articles in Referred Journals

A Design Methodology For Real-Time Systems to be Implemented on Multiprocessors Target Machine.
L. Zhang and B. Chaib-draa, In Journal of Systems and Software, 37--56, 1996 (bib).

Hierarchical Model and Communication by Signs, Signals and Symbols in Multiagent Environments.
B. Chaib-draa and P. Levesque, In Journal of Experimental and Theoretical AI, 7--20, 1996, (pdf) (bib).
Abstract: In this paper, a framework based on the skills, rules and knowledge taxonomy of Rasmussen is proposed. Specifically, a reflexive level is developed so as to reflect the fully automated activities, then a rule level to reflect stereotyped actions, and finally a knowledge level to reflect conscious activities involving distributed decision making. The basic goal of this framework is twofold: first, not to force processing to a higher level (i.e. the knowledge level) than the situation requires, and second, to support each of the three levels of cognitive control. More precisely, the proposed framework should allow agents to prefer the skill and rule levels rather than the higher knowledge level because it is generally easier to obtain and maintain coordination between agents in routine and familiar situations than in unfamiliar situations. The framework should also support each of the three levels because complex tasks combined with complex interactions require all levels. To permit agents to rely on low levels, a suggestion is developed: when possible, agents have to communicate by signals and signs, since signals generally invoke a stimulus or a reaction, that is, a routine situation, whereas signs generally activate familiar situations. Finally, implementation and experiments demonstrated, on some scenarios of urban traffic, the applicability of concepts developed in this article.

Interaction Between Agents in Routine, Familiar and Unfamiliar Situations.
Brahim Chaib-draa, In International Journal of Intelligent & Cooperative Information Systems, 1--25, 1996, (pdf) (bib).
Abstract: A framework for designing a multiagent system (MAS) in which agents are capable of coordinating their activities in routine, familiar, and unfamiliar situations is proposed. This framework is based on the skills, rules and knowledge (S-R-K) taxonomy of Rasmussen. Thus, the proposed framework should allow agents to prefer the lower skill-based and rule-based levels rather than the higher knowledge-based level because it is generally easier to obtain and maintain coordination between agents in routine and familiar situations than in unfamiliar situations. The framework should also support each of the three levels because complex tasks combined with complex interactions require all levels. To permit agents to rely on low levels, a suggestion is developed: agents are provided with social laws so as to guarantee coordination between agents and minimize the need for calling a central coordinator or for engaging in negotiation which requires intense communication. Finally, implementation and experiments demonstrated, on some scenarios of urban traffic, the applicability of major concepts developed in this article.

Evaluation de diverses méthodes pour des problèmes d'allocation de règles dans les systèmes de production parallèles.
F. Hassaine and B. Chaib-draa, In Revue d'Intelligence Artificielle, 496--497, 1996 (bib).

Book Chapters

A Review of Distributed Artificial Intelligence.
B. Moulin and B. Chaib-draa, In Foundations of Distributed Artificial Intelligence, Wiley, 3--55, 1996 (bib).

Articles in Referred Proceedings

Reasoning on Conflicts and Negotiation Through Causal Maps.
B. Chaib-draa, In Proc. of Sec. Int. Conf. on Multi-Agent Systems, 1996, Poster (bib).

A Hierarchical Model of Agent Based on Skill, Rules, and Knowledge.
B. Chaib-draa, In Advances in Artificial Intelligence, 11th Biennial Conf. on AI, McCalla, 1996 (bib).

Structures relationnelles pour les interactions entre agents.
K. Lechilli and B. Chaib-draa, In Actes des 4èmes Jour. Franc. en Intelligence Artificielle Distribuée-Systèmes Multi-Agents, 1996 (bib).


1995


Articles in Referred Journals

Industrial Applications of Distributed AI.
Brahim Chaib-draa, In Communication of ACM, 49--53, 1995, (pdf) (bib).

Articles in Referred Proceedings

Coordination en situation non familières.
S. Lizotte and B. Chaib-draa, In Actes des 3èmes Journées Francophones en Intelligence Artificielle Distribuée & Systèmes Multi-Agents, 255--266, 1995 (bib).


1994


Book Chapters

Distributed Artificial Intelligence: An Overview.
B. Chaib-draa, In Encyclopedia of Computer Science and Technology, 1994 (bib).

Articles in Referred Proceedings

A Relation Graph Formulation for Relationships Between Agents.
B. Chaib-draa, J. Desharnais and S. Lizotte, In Proceedings of 13th International DAI Workshop, 1994 (bib).

Hierarchical Model and Communication by Signs, Signals and symbols in Multiagent Environments.
B. Chaib-draa and P. Levesque, In Modeling Autonomous Agents in a Multi-Agent World, MAAMAW'94, 1994, Appeared also in Distributed Software Agents and Applications, Perram J. W. and Müller (eds) (bib).


1992


Articles in Referred Journals

Trends in Distributed Artificial Intelligence.
Brahim Chaib-draa, Bernard Moulin, R. Mandiau and P. Millot, In Artificial Intelligence Review, 35--66, 1992 (bib).