Middleware'98


Session 3: Availability and Integrity - Report

Session Chair

Geoff Coulson

Geoff Coulson received a first class honours degree in Computer Science and a Ph.D. in the area of systems support for multimedia applications from Lancaster University, UK. He is currently a lecturer at Lancaster and is involved in a number of research projects. These include the SUMO project, which concerns next-generation middleware, and WAND, a European Commission funded collaborative project which is developing a mobile ATM demonstrator. His research interests are distributed systems architectures, multimedia communications and operating system support for continuous media. He has organised and served on the program committees of numerous conferences and workshops in his area.


Highly Available Trading System: Experiments With CORBA

X. Defago, K. Mazouni and A. Schiper
Ecole Polytechnique Federale de Lausanne, Switzerland

Presented by Xavier Defago

This talk described work carried out in a project for the Swiss Stock Exchange. The stock exchange computer system consists of an exchange system which is fully connected to a number of access points. The project was concerned with the access systems that connect via these access points. In a stock exchange system, timeliness is very important. An existing solution based on ISIS is already in place, but it does not scale because it suffers from a positive acknowledgement bottleneck. Moreover, ISIS is a product, and products die while standards survive. CORBA offers a better solution: it does not suffer from the same acknowledgement problem, and it is a standard.

Critical to the performance of the stock exchange system is the manner in which slow traders are connected: they should not bring the entire system to its knees. Also, when failure occurs, it is no longer satisfactory to conduct state transfers which monopolise the network and block the primary server. Such systems need to be operational 24 hours a day. The proposed solution transfers data in small chunks without imposing such restrictions on other users.
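
The idea of transferring state in small chunks can be illustrated with a minimal sketch. It is not the authors' implementation: the names `state_chunks` and `transfer_interleaved`, the dictionary-shaped state and the strict one-chunk-one-request alternation are all assumptions made for illustration, and the pending requests are assumed not to modify the snapshot being transferred.

```python
from typing import Dict, Iterator, List, Tuple


def state_chunks(state: Dict[str, int], chunk_size: int) -> Iterator[List[Tuple[str, int]]]:
    """Yield the state as small lists of (key, value) pairs (hypothetical helper)."""
    items = sorted(state.items())
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def transfer_interleaved(state: Dict[str, int], requests: List[str], chunk_size: int = 2):
    """Alternate one state chunk with one pending request.

    The primary never sends the whole state in one go, so a request waits
    for at most one chunk rather than for the complete transfer.
    Returns the rebuilt replica state and a log of the interleaving.
    """
    replica: Dict[str, int] = {}
    log: List[str] = []
    chunks = state_chunks(state, chunk_size)
    pending = list(requests)
    while True:
        chunk = next(chunks, None)
        if chunk is not None:
            replica.update(chunk)              # apply this chunk on the new replica
            log.append(f"chunk:{len(chunk)}")
        if pending:
            log.append(f"request:{pending.pop(0)}")  # primary serves a request in between
        if chunk is None and not pending:
            break
    return replica, log
```

The chunk size trades transfer time against responsiveness: smaller chunks lengthen the transfer but shorten the longest pause seen by any one request.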

Xavier concluded by saying that state transfer mechanisms such as that of ISIS are inadequate in this context, as applications with timing requirements cannot tolerate blocking. CORBA offers a more appropriate solution.

There was a question from the floor asking whether secondary servers process messages at the same time as primaries. Xavier answered that they do, but do not transmit responses. Also, he pointed out that messages are time-stamped to ensure the same ordering at all destinations.
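
The behaviour described in this answer can be sketched as follows. This is a simplified illustration rather than the system presented: the `Message` and `Replica` classes are hypothetical, ordering is done by sorting a batch on (timestamp, sender), and the tie-break on sender is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(order=True)
class Message:
    timestamp: int
    sender: str                          # assumed tie-breaker for equal timestamps
    payload: str = field(compare=False)  # not part of the ordering


class Replica:
    """Processes every message; only the primary transmits responses."""

    def __init__(self, name: str, is_primary: bool) -> None:
        self.name = name
        self.is_primary = is_primary
        self.processed: List[str] = []
        self.sent: List[str] = []

    def deliver(self, batch: List[Message]) -> None:
        # Sorting by (timestamp, sender) gives every replica the same order,
        # whatever order the network delivered the messages in.
        for msg in sorted(batch):
            self.processed.append(msg.payload)
            if self.is_primary:
                self.sent.append(f"ack:{msg.payload}")
```

Because a secondary has already processed the same messages in the same order, it holds the same state as the primary and can take over without replaying history.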


Integrity Management in GUARDS

E. Totel, Lj. Beus-Dukic, J.-P. Blanquart, Y. Deswarte, D. Powell and A. Wellings
LAAS-CNRS, France / University of York, UK / Matra Marconi Space, France

Presented by Eric Totel

The GUARDS project involves eight partner institutions and has three prospective end-users with widely varying requirements - railway, nuclear and space. The goals of the project are tolerance of physical faults and the use of off-the-shelf components. Eric explained that the approach is to enable software of different criticalities to share common resources safely: the "integrity dimension".

Criticality is linked to the consequences of potential failure, and non-critical software may contain bugs which could corrupt highly critical software. By using a combination of spatial firewalls (memory protection) and temporal firewalls (scheduling policies and budget/watchdog timers), GUARDS is able to protect software of one criticality from software of another. The scheduling policy takes the form of a pre-emptive, priority-based scheme. The integrity model assigns a level of integrity to each object in the system, and these levels are managed by validation objects. Objects may be multi-level, in which case their level of integrity is dynamic. Eric concluded by saying this led to a flexible solution which introduced a negligible overhead compared to the time it takes for invocations.
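
One plausible reading of the integrity model is sketched below. The report does not give the actual rules, so everything here is an assumption: that a higher number means higher integrity, that data may flow freely only from an equal or higher level to a lower one, and that a validation object can approve an upward flow. The names `Obj` and `may_flow` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    name: str
    level: int  # assumption: a higher number means higher integrity


def may_flow(sender: Obj, receiver: Obj, validated: bool = False) -> bool:
    """Return True if data may pass from sender to receiver.

    Assumed rule: a receiver accepts data from an object of equal or higher
    integrity; data from a lower level is accepted only if a validation
    object has checked it first (validated=True).
    """
    return validated or sender.level >= receiver.level
```

Under this reading, a bug in low-integrity software cannot silently corrupt high-integrity software, because the only upward path runs through an explicit validation step.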


The Realize Middleware for Replication and Resource Management

P. Melliar-Smith, L. Moser, V. Kalogeraki and P. Narasimhan
University of California, Santa Barbara, USA

Presented by Michael Melliar-Smith

Michael began by outlining his objectives: to build complex applications that are dependable, adaptable and evolvable by exploiting networking and replication, and to simplify network programming. Multiple servers are needed for fault tolerance and must be kept consistent with one another, but this should be hidden from the application programmer, who should be unaware of replication.

Realize is built on top of TOTEM and features an object-oriented, rather than message-oriented, API. The system sits between CORBA and the network, intercepting IIOP messages before they reach the TCP stack (this works unmodified with commercial ORBs). Intercepted data is processed and then transmitted over UDP using a reliable, totally ordered multicast protocol supplied by TOTEM. Resources are managed by a resource manager, and elements are serviced in soft real time. The system has a range of monitoring functions and uses a least-laxity scheduling algorithm. Scheduling algorithms designed for a single processor are inappropriate since tasks may migrate across many processors over time. Michael closed by commenting that when migrating objects, it is important to focus on reducing the queuing delay: the processing delay is fixed anyway.
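
Least-laxity scheduling in its textbook form can be sketched as follows. This is the generic single-queue version, not the Realize scheduler, which according to the talk must cope with tasks spread across several processors. The `Task` fields and function names are assumptions for illustration. Laxity is the slack left before a task must start in order to meet its deadline.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    deadline: float   # absolute time by which the task must finish
    remaining: float  # computation time still required


def laxity(task: Task, now: float) -> float:
    """Slack available before the task can no longer meet its deadline."""
    return task.deadline - now - task.remaining


def pick_next(tasks: List[Task], now: float) -> Task:
    """Choose the task with the least laxity, i.e. the most urgent one."""
    return min(tasks, key=lambda t: laxity(t, now))
```

A task with a later deadline can still be the most urgent if it has a lot of work left, which is what distinguishes this policy from earliest-deadline-first.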

Several questions emerged from the floor. First, the implied heterogeneity of Realize was questioned. Michael replied that the use of Java offered a machine-independent execution environment while CORBA (at least in theory!) solved the problem of externalisation. When quizzed about locating object groups rather than objects, Michael answered that object references were not sent out, only object group references. In answer to two further questions, it was stated that "first answer back" was the current approach but that majority voting could also be achieved, and that duplicates were recognised through the use of operation identifiers.
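
The two reply-handling policies mentioned in these answers can be sketched as follows. This is an illustration only: the `ReplyCollector` class, the `majority` function and the form of the operation identifiers are assumptions, not part of the Realize API.

```python
from collections import Counter
from typing import Hashable, List, Optional, Set


class ReplyCollector:
    """'First answer back': accept the first reply per operation identifier."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def first_answer(self, op_id: str, reply: Hashable) -> Optional[Hashable]:
        # Replies from other replicas carry the same operation identifier,
        # so they are recognised as duplicates and dropped.
        if op_id in self._seen:
            return None
        self._seen.add(op_id)
        return reply


def majority(replies: List[Hashable]) -> Optional[Hashable]:
    """Return the value given by more than half of the replies, else None."""
    if not replies:
        return None
    value, count = Counter(replies).most_common(1)[0]
    return value if count > len(replies) / 2 else None
```

First answer back gives the lowest latency but trusts whichever replica responds first; majority voting waits for more replies in exchange for masking a replica that returns a wrong value.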


Panel

Chair: Geoff Coulson
XD: Xavier Defago
ET: Eric Totel
MMS: Michael Melliar-Smith

Question: Was ISIS so inappropriate? Are you not ignoring some of the guarantees it offers? And aren't negative acknowledgements only part of the problem? What about information going to only a subset of traders - potential insider dealing allegations?
XD: A process that is too slow will lose messages, so crash it. ISIS does just that. It's a different protocol but the result is the same - you can't just slow down the whole system.
MMS: People often forget that hardware is cheap while programmers are expensive.
XD: But when that same hardware is replicated across many places...

John Bates: When you're transferring state, events are still arriving. How do you handle this?
XD: Order is recreated by timestamp. State changes go straight into the database while requests are buffered for later processing.

Question: You're assuming individual method executions don't place a high load on the system. Therefore there are lots of method executions? How much overhead does this create?
MMS: We don't yet have the answer to that. We need to investigate the computational costs and will be doing so over the next six months.

Question: If you have a high load, sending negative acknowledgements for retransmissions is pretty pointless. What about buffer overruns? Is your network perfect?
MMS: Modern protocols deal with flow control. We assume totally ordered, reliable group communications at a lower level. We are not directly concerned with lost messages.
XD: In practice we don't know if we're getting negative acknowledgements. However, we assume the peaks in our graphs indicate the presence of such.

Neil Mason: Would you care to speculate on the possibility of enhancing arbitrary components (e.g. the ORB) by replacing them while the system is running?
MMS: That's equivalent to replacing a processor! It can't be done with sleight of hand.

Question: Have you thought about experimenting with hard scheduling rather than soft scheduling?
MMS: I've spoken to Doug Schmidt but it's my opinion that it's a case of interfacing separate systems, not doing everything at the same time.

Question: What about real-time CORBA?
MMS: What is real-time CORBA going to be!?