|Abstract:||Large scale networked embedded systems are becoming increaingly popular with the technology advances in wireless network, energy efficient hardware, and cost effective manufacturing. Parameter tuning of such large scale systems has a significant effect on the performance metrics such as reliability, availability, longevity, and energy consumption. The parameter tuning should be based on a performance evaluation which requires a model that closely represents the real system. In modeling a large scale system, an abstraction of the state space of the system is necessary in order to avoid the state explosion problem. Moreover, users not only, need an expressive and accurate way to describe the performance criteria in terms of the model, they also need a way to evaluate the performance criteria against the model. They can do such evaluation manually for certain well-known properties or they can explore unknown properties with the aid of computers.
Many large scale systems have a stochastic behavior. Such stochastic behavior is the result of the randomness in the systems' operating environments. It can also be the result of the use of randomized protocols that are used to reduce the need for costly synchronization. We abstract a system state as a probability mass function (pmf); a pmf is succinct in representation and is informative enough to describe many useful aggregate behaviors of the system. We model the dynamics of the state transitions as a Discrete Time Markov Chain (DTMC) and develop an on-line model estimation method as well as a goodness of fit test that statistically validates the model. We develop a probabilistic temporal logic called iLTL in order to specify aggregate behaviors of large scale systems. Many important performance criteria on DTMC models can be accurately expressed in iLTL. For the iLTL specifications, performance evaluation is systematically performed through a process of model checking. If a performance criterion is not satisfiable, the model checker returns a sequence of computations that violates the specification in order to help users fix the problem. We show the usefulness of our methodology through an experiment with a wireless sensor network of 90 nodes.