authorjohn_c <john_c@ae88bc3d-4319-0410-8dbf-d08b4c9d3795>2004-11-19 22:14:54 +0000
committerjohn_c <john_c@ae88bc3d-4319-0410-8dbf-d08b4c9d3795>2004-11-19 22:14:54 +0000
commite7acc2ff70160f040560752748711bf7de42ef37 (patch)
treeeede6298a1baa4cbfe0b8d96d1c3e5efa0b767e3 /TAO/docs
parentec699873090bcaac3937a94ed17a0c3d65c9b5e4 (diff)
downloadATCD-e7acc2ff70160f040560752748711bf7de42ef37.tar.gz
Thu Nov 18 12:39:59 2004 Ciju John <john_c@ociweb.com>
Fri Oct 29 10:53:56 2004 Dale Wilson <wilson_d@ociweb.com>
Wed Oct 27 11:59:01 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 25 20:41:00 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 25 14:51:09 2004 Dale Wilson <wilson_d@ociweb.com>
Wed Oct 20 11:38:11 2004 Dale Wilson <wilson_d@ociweb.com>
Tue Oct 19 10:43:28 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 18 15:21:49 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 18 10:29:48 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 18 10:11:47 2004 Dale Wilson <wilson_d@ociweb.com>
Tue Oct 12 14:10:43 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 11 14:39:15 2004 Dale Wilson <wilson_d@ociweb.com>
Thu Oct 7 09:40:51 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 18 13:02:11 2004 Dale Wilson <wilson_d@ociweb.com>
Wed Oct 13 15:44:58 2004 Dale Wilson <wilson_d@ociweb.com>
Diffstat (limited to 'TAO/docs')
-rw-r--r--TAO/docs/notification/reliability.html346
1 files changed, 346 insertions, 0 deletions
diff --git a/TAO/docs/notification/reliability.html b/TAO/docs/notification/reliability.html
new file mode 100644
index 00000000000..44c9c198035
--- /dev/null
+++ b/TAO/docs/notification/reliability.html
@@ -0,0 +1,346 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
+<html>
+ <head>
+ <title>Using the Reliable Notification Service</title>
+ <meta content="False" name="vs_snapToGrid">
+ <meta content="False" name="vs_showGrid">
+ <!-- $Id$ -->
+ </head>
+ <body>
+ <h1>Using the Reliable Notification Service</h1>
+ <h2>Background</h2>
+ <p>There are two CORBA services defined by the OMG to support the
+ Supplier/Consumer design pattern.&nbsp; This pattern allows messages (known as
+ Events in this context) to be generated by one or more suppliers and delivered
+ to one or more consumers without requiring that the suppliers and consumers
+ have any knowledge of each other.&nbsp;</p>
+ <P>The Event Service provides a basic implementation of this pattern, and the
+ Notification service extends this basic service to support a rich variety of
+ optional features.</P>
+ <h2>Reliability and Persistence</h2>
+ <p>One of the optional features of the Notification service is Reliability.&nbsp;
+ By default the Event Service and the Notification service provide only <EM>best-effort</EM>
+ event delivery.&nbsp; If things go wrong (program crashes, communications
+ failures, etc.), events may be lost without notice.</p>
+ <P>There are some circumstances in which losing events is&nbsp; not
+ acceptable.&nbsp; The Notification service may be used for these situations if
+ it is configured for reliable operation.&nbsp; Reliable operation is not
+ available in the Event Service.&nbsp; Reliable operation means information is
+ saved persistently (usually on a disk file) and used to recover from the
+ various failures that might otherwise lead to loss of data.</P>
+ <P>There are two separate, but related, issues that need to be addressed to
+ provide reliable event delivery:&nbsp; topology persistence and event
+ persistence.</P>
+ <P>To provide topology persistence, sometimes called connection persistence, the
+ Notification service must keep track of which clients (Suppliers and Consumers)
+ have connected to the Notification service and which options have been specified
+ to control the delivery of events.</P>
+ <P>To provide event persistence the Notification service tracks each event in
+ persistent storage to be sure it is delivered to every consumer that should
+ receive it.&nbsp;
+ </P>
+ <P>There may be situations in which topology persistence is all that is necessary
+ -- it&nbsp;may be&nbsp;acceptable to lose events during a failure as long as
+ the system is restored to normal operation afterward.&nbsp; Event persistence
+ on the other hand can only be supported if topology persistence is also being
+ used.&nbsp; It doesn't help to keep track of events if the system is unable to
+ find the consumers to which the events should be delivered.</P>
+ <P>Two separate issues must be addressed as part of setting up the Notification
+ service for reliable operation.&nbsp; At the system administration level the
+ Notification&nbsp; service must be configured for topology persistence and
+ possibly for event persistence.&nbsp; At the application level,&nbsp;programs
+ that operate as consumers and suppliers must set the appropriate parameters to
+ enable reliable operation, and must cooperate with the reconnection process
+ that occurs during topology recovery.</P>
+ <h2>Configuring Notification Service Reliability</h2>
+ <h3>Service Configurator Changes</h3>
+ <P>Runtime configuration of the Notification Service is supported through the
+ service configurator file. This file is normally named svc.conf; however the
+ -ORBSvcConf command line option allows an alternate service configuration file
+ to be specified.
+ </P>
+ <P>
+ Service configuration changes to support Notification Service reliability
+ include a new option on the existing&nbsp; <code>Notify_Default_Event_Manager_Objects_Factory</code>
+ service configuration command, and two new service configuration commands.
+ </P>
+ <H4>Notify_Default_Event_Manager_Objects_Factory option: -AllowReconnect</H4>
+ <p>Certain recovery cases require that a Consumer be able to reconnect to an
+ existing proxy object in the Notification Service in order to receive all
+ events delivered by that proxy object. This behavior is a departure from the
+ OMG Specification which mandates that the Notification Service should throw an
+ "Already Connected" exception when a consumer attempts to connect to a proxy
+ that was previously used by another Consumer.
+ </p>
+ <p>A new option, -AllowReconnect,&nbsp;is available for the existing <code>Notify_Default_Event_Manager_Objects_Factory
+ </code>command to support this requirement. As an example of its use, the
+ following line configures the Notification Service for multi-threaded operation
+ supporting reconnection.</p>
+ <code>static Notify_Default_Event_Manager_Objects_Factory "-DispatchingThreads 2
+ -SourceThreads 2 -AllowReconnect" </code>
+ <H3>Configuring Connection (Topology) Reliability</H3>
+ <p>The support for persistent topology is actually a configurable strategy.&nbsp;
+ TAO includes an XML Topology Persistence Strategy that uses an XML file for
+ persistent storage, but it is designed to allow other strategies to be
+ developed.&nbsp; For example if topology information should be stored in a
+ relational database file, it is possible to develop a persistent topology
+ strategy to do so.&nbsp; The details of doing this are beyond the scope of this
+ document.
+ </p>
+ <P>This document describes how to configure the XML topology persistence included
+ with TAO.</P>
+ <P>An example of the&nbsp;service configuration command to&nbsp;configure the XML
+ strategy is:
+ </P>
+ <p><code>dynamic Topology_Factory Service_Object*
+ TAO_CosNotification_persist:_make_XML_Topology_Factory() "-base_path ./reconnect_test" </code>
+ </p>
+ <p>The first part of this line, <code>dynamic Topology_Factory Service_Object*
+ TAO_CosNotification_persist:_make_XML_Topology_Factory()</code>, should be given
+ exactly as shown. For details on this syntax, see chapter 17 of the TAO
+ Developer's Guide.
+ </p>
+ <P>The quoted string at the end of the line contains arguments for the configured
+ strategy. The arguments recognized by the XML topology strategy implemented in
+ this project are:
+ </P>
+ <ul>
+ <li>
+ -v
+ <li>
+ -base_path <EM>file_path</EM>
+ <li>
+ -backup_count&nbsp;<EM>count</EM>
+ <li>
+ -save_base_path <EM>file_path</EM>
+ <li>
+ -load_base_path <EM>file_path</EM>
+ <li>
+ -no_timestamp
+ </li>
+ </ul>
+ <H4>Topology_Factory Option: -v</H4>
+ <P>To help diagnose and/or document svc.conf settings, the "-v" option causes the
+ options for the Topology_Factory to be displayed as they are interpreted.</P>
+ <H4>Topology_Factory Option: -base_path file_path
+ </H4>
+ <P>The argument for this option is a fully qualified path name, without an
+ extension, for the XML file in which topology information is saved. Three
+ extensions will be appended to this name: .new, .xml, and .000
+ </P>
+ <P>Saved topology information will be written to the <EM>file_path</EM>.new file.
+ Information with a .new extension is not necessarily complete and will not be
+ used to restore the topology.
+ </P>
+ <P>When the .new file is complete, the previous <EM>file_path</EM>.000 (if any)
+ will be deleted, the previous <EM>file_path</EM>.xml (if any) will be renamed
+ as <EM>file_path</EM>.000 and the <EM>file_path</EM>.new file will be renamed
+ as <EM>file_path</EM>.xml. The assumption is that a file system rename operation is
+ atomic. If this assumption holds then at any time the file <EM>file_path</EM>.xml
+ (if it exists) contains the most recent complete save. If <EM>file_path</EM>.xml
+ does not exist then <EM>file_path</EM>.000 contains the most recent complete
+ save. If neither of these files exists, the saved topology information is not
+ available.
+ </P>
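+ <P>The renaming order described above can be sketched in standard C++17. This is
+ only an illustration of the scheme, not the code used by the XML topology
+ strategy. The function names <code>save_topology</code> and
+ <code>latest_complete_save</code> are hypothetical, and error handling is
+ reduced to a single assertion.</P>

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <initializer_list>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: stores `contents` as a complete save for `base_path`,
// following the .new -> .xml -> .000 rotation described in the text.
void save_topology(const fs::path& base_path, const std::string& contents) {
  const fs::path fresh = base_path.string() + ".new";
  const fs::path current = base_path.string() + ".xml";
  const fs::path backup = base_path.string() + ".000";

  {
    // The .new file may be incomplete until this scope ends.
    std::ofstream out(fresh, std::ios::trunc);
    out << contents;
    out.flush();
    assert(out.good());
  }

  fs::remove(backup);  // does nothing if there is no previous backup
  if (fs::exists(current)) {
    fs::rename(current, backup);  // previous save becomes the backup
  }
  fs::rename(fresh, current);  // the new save becomes the current one
}

// Returns the file holding the most recent complete save, if there is one.
// A leftover .new file is never considered.
std::optional<fs::path> latest_complete_save(const fs::path& base_path) {
  for (const char* ext : {".xml", ".000"}) {
    fs::path candidate = base_path.string() + ext;
    if (fs::exists(candidate)) {
      return candidate;
    }
  }
  return std::nullopt;
}
```

+ <P>If the process stops between the two renames, only <EM>file_path</EM>.000
+ exists, which is why the lookup falls back to it.</P>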
+ <H4>Topology_Factory Option: -backup_count count</H4>
+ <P>This option modifies the behavior described in the preceding section to allow
+ additional backup copies of the topology file to be retained. The default
+ value, 1, means that only the <EM>file_path</EM>.000 file will be kept. If a
+ higher number is specified, then older versions will be kept. Rather than
+ deleting <EM>file_path</EM>.000, the system will rename it to be <EM>file_path</EM>.001.&nbsp;
+ Older versions will be named <EM>file_path</EM>.002, <EM>file_path</EM>.003 and
+ so on.
+ </P>
+ <P>Under normal circumstances only one backup file is required; in fact these
+ additional backup files will not be used to restore the topology.&nbsp; However
+ setting this number to a larger value lets the system keep a brief history of
+ topology changes. Since the XML files are roughly human-readable this can be
+ used as a diagnostic tool for problems related to Notification Service
+ topology.
+ </P>
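+ <P>The numbering of the extra backups can be illustrated with a short C++17
+ sketch. It assumes that the oldest retained copy is dropped and every other
+ copy moves up by one index before a new .000 file is created. The names
+ <code>backup_name</code> and <code>shift_backups</code> are hypothetical, and
+ treating a count of 0 as "do nothing" is an assumption of the sketch.</P>

```cpp
#include <cassert>
#include <cstdio>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Builds "<base_path>.NNN" with a zero-padded three-digit index, e.g. ".000".
fs::path backup_name(const fs::path& base_path, unsigned index) {
  assert(index < 1000);
  char ext[16];
  std::snprintf(ext, sizeof ext, ".%03u", index);
  return base_path.string() + ext;
}

// Hypothetical helper: frees the ".000" slot while keeping at most
// `backup_count` backups. With a count of 1 it simply deletes ".000".
void shift_backups(const fs::path& base_path, unsigned backup_count) {
  if (backup_count == 0) {
    return;  // assumption: nothing to rotate
  }
  // The oldest retained backup falls off the end.
  fs::remove(backup_name(base_path, backup_count - 1));
  // Move the remaining backups up by one, oldest first.
  for (unsigned i = backup_count - 1; i > 0; --i) {
    const fs::path from = backup_name(base_path, i - 1);
    if (fs::exists(from)) {
      fs::rename(from, backup_name(base_path, i));
    }
  }
}
```

+ <P>After the shift, the previous <EM>file_path</EM>.xml can be renamed to
+ <EM>file_path</EM>.000 as described for the -base_path option.</P>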
+ <H4>Topology_Factory Options: -save_base_path file_path and -load_base_path
+ file_path
+ </H4>
+ <P>These options are alternatives to the -base_path option. They allow the file
+ from which topology information is loaded at Notification Service startup time
+ to be different from the file to which this information is saved as the system
+ runs.
+ </P>
+ <P>These options are mostly used for developer testing, but a system administrator
+ may find another use for them, possibly involving script files that
+ rename the XML files during recovery from a Notification Service failure.
+ </P>
+ <H4>Topology_Factory Option: -no_timestamp</H4>
+ <P>The XML files include a timestamp to indicate when the information was saved.
+ The timestamp is for information only and is not needed for correct functioning
+ of the topology persistence. This option suppresses that timestamp. Doing so
+ makes it possible to compare XML files using a program like diff to see if the
+ files represent the same topology.
+ </P>
+ <P>This option is intended primarily for testing the persistent topology
+ implementation.
+ </P>
+ <h3>Configuring Event Reliability</h3>
+ <p>A new service configuration object, "Event_Persistence", can be configured in
+ the service configuration file to enable and configure Event Reliability.
+ An example of the line needed to configure the Event_Persistence object is:
+ </p>
+ <p><code>dynamic Event_Persistence Service_Object*
+ TAO_CosNotification_persist:_make_Standard_Event_Persistence() "-v -file_path
+ ./event_persist.db" </code>
+ </p>
+ <p>If this line does not appear in svc.conf, then event reliability
+ will not be supported. QoS parameters for reliable event delivery will be
+ silently ignored when Event Reliability is not configured. Event reliability
+ also requires topology reliability, so if this line appears there must also be
+ a "Topology_Factory" line in the file. If not, the Notification Service will
+ fail to start up.
+ </p>
+ <P>The beginning of this line, up to and including the parentheses, should appear
+ exactly as shown. For details on this syntax, see chapter 17 of the TAO
+ Developer's Guide. The quoted string at the end of the line contains options
+ for Event_Persistence.
+ </P>
+ <h4>Event_Persistence Option: -v</h4>
+ <p>This option and any option that appears after this option will be written to
+ the log (normally the console) as it is processed. This is intended to help
+ diagnose and document the Event Persistence settings. The default is to
+ configure Event Persistence silently.
+ </p>
+ <h4>Event_Persistence Option: -file_path path
+ </h4>
+ <p>This option gives the fully qualified name for the file in which
+ persistent event information will be stored. The file should be configured on a
+ reliable device that supports synchronized writes (i.e., writes that flush the
+ operating system's write cache). A device that is suitable for storing a reliable
+ database would be appropriate for storing this file. The file will be subject
+ to a relatively high number of small (single block) write requests, but very
+ few, if any, read requests. If the file does not exist, then a new file will be
+ created. If the file does exist, and if topology is successfully loaded, the
+ events from this file will be reloaded and redelivered automatically. This is a
+ required option. There is no default value.
+ </p>
+ <h4>Event_Persistence Option: -block_size n
+ </h4>
+ <p>This option gives the block size in bytes for the device on which the event
+ reliability file is stored. For both performance and reliability reasons it is
+ important that the value matches the physical characteristics of the device.
+ The default value is 512.
+ </p>
+ <h2>Application Programming Changes to Support Reliability</h2>
+ <p>
+ When it is configured as described above, the Notification service
+ supports reliable connectivity and/or event delivery.&nbsp;
+ Actually achieving such reliability, however, requires cooperation from the
+ Notification service clients (Suppliers and Consumers).
+ <P>
+ There are a number of failure possibilities and different recovery techniques
+ are needed to handle them.&nbsp; The simplest case is when a client
+ fails&nbsp;and is restarted.&nbsp;
+ <P>
+ The Notification service will have maintained the connection points (Supplier
+ and Consumer Admins, Proxy Consumers, Proxy Suppliers, etc.).&nbsp; As each of these
+ connections was established, an ID was returned by the Notification
+ service.&nbsp; An application that wishes to be reconnected after a failure
+ should save a persistent copy of these IDs.&nbsp; For example, it could write
+ the IDs to a file, then read them back from the file after restarting.&nbsp;
+ Using these IDs the application can reconnect to the existing connection
+ points in the Notification service.&nbsp; The reconnection to the Proxy objects
+ will only work if the Notification service has been configured with the
+ -AllowReconnect option described above, but otherwise this process is fairly
+ straightforward.
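+ <P>
+ Saving and restoring the IDs might look like the following minimal sketch in
+ standard C++17. The <code>SavedIds</code> structure, its three fields, and the
+ one-line text format are all assumptions made for illustration; a real client
+ would store whichever IDs it received while connecting.

```cpp
#include <cassert>
#include <fstream>
#include <optional>
#include <string>

// Hypothetical record of the IDs handed out while the client connected.
struct SavedIds {
  long channel_id = 0;
  long admin_id = 0;
  long proxy_id = 0;
};

// Writes the IDs as one line of text. Returns false if the write failed.
bool save_ids(const std::string& file_name, const SavedIds& ids) {
  std::ofstream out(file_name, std::ios::trunc);
  out << ids.channel_id << ' ' << ids.admin_id << ' ' << ids.proxy_id << '\n';
  out.flush();
  return out.good();
}

// Reads the IDs back after a restart. Returns an empty optional when the
// file is missing or malformed, in which case the client must connect afresh.
std::optional<SavedIds> load_ids(const std::string& file_name) {
  std::ifstream in(file_name);
  SavedIds ids;
  if (in >> ids.channel_id >> ids.admin_id >> ids.proxy_id) {
    return ids;
  }
  return std::nullopt;
}
```

+ <P>
+ On startup the client would call <code>load_ids</code> first and fall back to
+ creating new connection points only when nothing was saved.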
+ <P>
+ As soon as a supplier has reconnected, it can resume sending events.&nbsp; As
+ soon as a consumer has reconnected, persistent events (if any) and new events
+ will start to arrive.
+ <P>
+ Notice that the identity of a consumer or supplier is determined by these saved
+ IDs.&nbsp; This is true even if the restarted client is running on a completely
+ different machine from the original client.
+ <P>
+ The case of the Notification service itself failing then being restarted on the
+ same or a different machine is somewhat more complicated.&nbsp; The
+ Notification service wasn't designed to initiate a connection to a
+ client.&nbsp; It must wait for the client to reconnect before it can start
+ accepting or delivering events.&nbsp; The difficulty is in having the client
+ know when to initiate the reconnection, and where the Notification service
+ is now running in case it was necessary to move it to a new machine due to the
+ failure.
+ <H3>Reconnection Registry</H3>
+ <p>The reconnection registry provides an answer to the question of how the client
+ knows where and when to reconnect to the Notification service.&nbsp; This
+ TAO-specific interface is implemented by the EventChannelFactory in the
+ reliable Notification Service.&nbsp; Clients can narrow the EventChannelFactory
+ object reference to the ReconnectionRegistry interface, then register a
+ ReconnectionCallback object that will be notified when the Notification
+ service has restarted and is ready for reconnections.&nbsp; The
+ EventChannelFactory passes its own object reference to the
+ ReconnectionCallback object to inform the client where the Notification service is now
+ running.</p>
+ <P>The interfaces involved are defined in the NotifyExt.idl file (in
+ $TAO_ROOT/orbsvcs/orbsvcs) and are shown here:</P>
+ <pre>
+ /**
+  * \brief An interface which gets registered with a ReconnectionRegistry.
+  *
+  * A supplier or consumer must implement this interface in order to
+  * allow the Notification Service to attempt to reconnect to it after
+  * a failure. The supplier or consumer must register its instance of
+  * this interface with the ReconnectionRegistry.
+  */
+ interface ReconnectionCallback
+ {
+   /// Perform operations to reconnect to the Notification Service
+   /// after a failure.
+   void reconnect (in Object new_connection);
+
+   /// Check to see if the ReconnectionCallback is alive
+   boolean is_alive ();
+ };
+
+ /**
+  * \brief An interface that handles registration of suppliers and consumers.
+  *
+  * This registry should be implemented by an EventChannelFactory and
+  * will call the appropriate reconnect methods for all ReconnectionCallback
+  * objects registered with it.
+  */
+ interface ReconnectionRegistry
+ {
+   typedef unsigned long ReconnectionID;
+   ReconnectionID register_callback(in ReconnectionCallback reconection);
+
+   void unregister_callback (in ReconnectionID id);
+
+   /// Check to see if the ReconnectionRegistry is alive
+   boolean is_alive ();
+ };
+ </pre>
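+ <P>The flow of calls between the registry and its callbacks can be modelled in
+ plain C++ without any CORBA machinery. The classes below are stand-ins that
+ only mirror the operation names of the IDL above; they are not the classes
+ generated by the IDL compiler. The <code>announce_restart</code> method and the
+ <code>RecordingCallback</code> class are hypothetical, and a string takes the
+ place of the object reference passed to <code>reconnect</code>.</P>

```cpp
#include <cassert>
#include <map>
#include <string>

// Plain C++ stand-in for the ReconnectionCallback interface.
struct ReconnectionCallback {
  virtual ~ReconnectionCallback() = default;
  virtual void reconnect(const std::string& new_connection) = 0;
  virtual bool is_alive() = 0;
};

// Plain C++ stand-in for the ReconnectionRegistry interface.
class ReconnectionRegistry {
 public:
  using ReconnectionID = unsigned long;

  ReconnectionID register_callback(ReconnectionCallback* callback) {
    assert(callback != nullptr);
    callbacks_[next_id_] = callback;
    return next_id_++;
  }

  void unregister_callback(ReconnectionID id) { callbacks_.erase(id); }

  // Hypothetical: models the restarted service telling every registered
  // callback where it is now running.
  void announce_restart(const std::string& new_connection) {
    for (auto& entry : callbacks_) {
      entry.second->reconnect(new_connection);
    }
  }

 private:
  ReconnectionID next_id_ = 1;
  std::map<ReconnectionID, ReconnectionCallback*> callbacks_;
};

// Example client-side callback that remembers the latest location. A real
// client would reconnect its proxies here using its saved IDs.
struct RecordingCallback : ReconnectionCallback {
  std::string last_connection;
  int reconnect_count = 0;

  void reconnect(const std::string& new_connection) override {
    last_connection = new_connection;
    ++reconnect_count;
  }
  bool is_alive() override { return true; }
};
```

+ <P>A client that no longer wants to be notified passes the ID returned by
+ <code>register_callback</code> to <code>unregister_callback</code>.</P>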
+ <H3>Using&nbsp;Event Reliability</H3>
+ <P>Configuring the Notification service for reliable event delivery is necessary,
+ but not sufficient to enable reliable handling of events.&nbsp; The application
+ code in either the client or the server must configure the EventChannel through
+ which the events are delivered to operate in the reliable mode.&nbsp; This is
+ done by setting the QoSProperties named "ConnectionReliability" and
+ "EventReliability" to the value "persistent", either at the time the channel
+ is created or at a later time using the set_qos method.</P>
+ <P>Once a channel has been configured for reliable operation, persistence can be
+ disabled on an event by event basis using QoSProperties of the event
+ itself.&nbsp; This could be done, for example, to avoid the overhead of
+ persistently storing events for which reliability is not needed.</P>
+ <P>The supplier sends events to the EventChannel using a push() method.&nbsp; For
+ persistent events, this call will not return to the supplier until the
+ Notification service is prepared to guarantee event delivery.&nbsp;
+ </P>
+ <P>Application code in the Consumer should be written with the knowledge that
+ events are guaranteed to be delivered, but during recovery from a failure there
+ is a possibility that an event may arrive more than once.&nbsp; This could
+ happen, for example, if the event was in the process of being delivered at the
+ time the failure occurred and the failure prevents the Notification service from
+ determining if the delivery completed successfully.&nbsp; To meet its
+ commitment that every event will be delivered, the Notification service will
+ retry the delivery in this case, which may result in a duplicate event.</P>
+ <P>As long as this situation is understood at the time the application is
+ designed, it should be possible for the application to handle this situation.</P>
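+ <P>One common way to tolerate duplicates is to make the consumer idempotent. The
+ sketch below assumes that the supplier stamps every event with a unique
+ identifier; that identifier, the <code>Event</code> structure, and the
+ <code>DeduplicatingConsumer</code> class are hypothetical and are not part of
+ the Notification service interfaces.</P>

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical event shape: `id` is assumed to be unique per event and to be
// assigned by the supplier.
struct Event {
  std::uint64_t id;
  std::string payload;
};

class DeduplicatingConsumer {
 public:
  // Returns true if the event was processed, false if it was a duplicate
  // of an event that has already been seen.
  bool push(const Event& event) {
    if (!seen_.insert(event.id).second) {
      return false;  // already handled; ignore the redelivery
    }
    processed_.push_back(event.payload);
    return true;
  }

  const std::vector<std::string>& processed() const { return processed_; }

 private:
  std::unordered_set<std::uint64_t> seen_;
  std::vector<std::string> processed_;
};
```

+ <P>In this sketch the set of seen identifiers lives only in memory and grows
+ without bound. A consumer that must itself survive restarts would need to
+ persist the set, and would normally limit how far back it remembers.</P>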
+ </body>
+</html>