author | john_c <john_c@ae88bc3d-4319-0410-8dbf-d08b4c9d3795> | 2004-11-19 22:14:54 +0000 |
---|---|---|
committer | john_c <john_c@ae88bc3d-4319-0410-8dbf-d08b4c9d3795> | 2004-11-19 22:14:54 +0000 |
commit | e7acc2ff70160f040560752748711bf7de42ef37 (patch) | |
tree | eede6298a1baa4cbfe0b8d96d1c3e5efa0b767e3 /TAO/docs | |
parent | ec699873090bcaac3937a94ed17a0c3d65c9b5e4 (diff) | |
download | ATCD-e7acc2ff70160f040560752748711bf7de42ef37.tar.gz |
Thu Nov 18 12:39:59 2004 Ciju John <john_c@ociweb.com>
Fri Oct 29 10:53:56 2004 Dale Wilson <wilson_d@ociweb.com>
Wed Oct 27 11:59:01 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 25 20:41:00 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 25 14:51:09 2004 Dale Wilson <wilson_d@ociweb.com>
Wed Oct 20 11:38:11 2004 Dale Wilson <wilson_d@ociweb.com>
Tue Oct 19 10:43:28 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 18 15:21:49 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 18 10:29:48 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 18 10:11:47 2004 Dale Wilson <wilson_d@ociweb.com>
Tue Oct 12 14:10:43 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 11 14:39:15 2004 Dale Wilson <wilson_d@ociweb.com>
Thu Oct 7 09:40:51 2004 Dale Wilson <wilson_d@ociweb.com>
Mon Oct 18 13:02:11 2004 Dale Wilson <wilson_d@ociweb.com>
Wed Oct 13 15:44:58 2004 Dale Wilson <wilson_d@ociweb.com>
Diffstat (limited to 'TAO/docs')
-rw-r--r-- | TAO/docs/notification/reliability.html | 346 |
1 files changed, 346 insertions, 0 deletions
diff --git a/TAO/docs/notification/reliability.html b/TAO/docs/notification/reliability.html new file mode 100644 index 00000000000..44c9c198035 --- /dev/null +++ b/TAO/docs/notification/reliability.html @@ -0,0 +1,346 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> +<html> + <head> + <title>Using the Reliable Notification Service</title> + <meta content="False" name="vs_snapToGrid"> + <meta content="False" name="vs_showGrid"> + <!-- $Id$ --> + </head> + <body> + <h1>Using the Reliable Notification Service</h1> + <h2>Background</h2> + <p>There are two CORBA services defined by the OMG to support the + Supplier/Consumer design pattern. This pattern allows messages (known as + Events in this context) to be generated by one or more suppliers and delivered + to one or more consumers without requiring that the suppliers and consumers + have any knowledge of each other. </p> + <P>The Event Service provides a basic implementation of this pattern, and the + Notification service extends this basic service to support a rich variety of + optional features.</P> + <h2>Reliability and Persistence</h2> + <p>One of the optional features of the Notification service is Reliability. + By default the Event Service and the Notification service provide only <EM>best-effort</EM> + support for event delivery. If things go wrong -- program crashes, + communications failures, etc. -- events may be lost without notice.</p> + <P>There are some circumstances in which losing events is not + acceptable. The Notification service may be used for these situations if + it is configured for reliable operation. Reliable operation is not + available in the Event Service.
Reliable operation means information is + saved persistently (usually in a disk file) and used to recover from the + various failures that might otherwise lead to loss of data.</P> + <P>There are two separate, but related, issues that need to be addressed to + provide reliable event delivery: topology persistence and event + persistence.</P> + <P>To provide topology persistence, sometimes called connection persistence, the + Notification service must keep track of what clients (Suppliers and Consumers) + have connected to the Notification service and what options have been specified + to control the delivery of events.</P> + <P>To provide event persistence the Notification service tracks each event in + persistent storage to be sure it is delivered to every consumer that should + receive it. + </P> + <P>There may be situations in which topology persistence is all that is necessary + -- it may be acceptable to lose events during a failure as long as + the system is restored to normal operation afterward. Event persistence, + on the other hand, can only be supported if topology persistence is also being + used. It doesn't help to keep track of events if the system is unable to + find the consumers to which the events should be delivered.</P> + <P>Two separate issues must be addressed as part of setting up the Notification + service for reliable operation. At the system administration level the + Notification service must be configured for topology persistence and + possibly for event persistence. At the application level, programs + that operate as consumers and suppliers must set the appropriate parameters to + enable reliable operation, and must cooperate with the reconnection process + that occurs during topology recovery.</P> + <h2>Configuring Notification Service Reliability</h2> + <h3>Service Configurator Changes</h3> + <P>Runtime configuration of the Notification Service is supported through the + service configurator file.
This file is normally named svc.conf; however, the + -ORBSvcConf command line option allows an alternate service configuration file + to be specified. + </P> + <P> + Service configuration changes to support Notification Service reliability + include a new option on the existing <code>Notify_Default_Event_Manager_Objects_Factory</code> + service configuration command, and two new service configuration commands. + </P> + <H4>Notify_Default_Event_Manager_Objects_Factory option: -AllowReconnect</H4> + <p>Certain recovery cases require that a Consumer be able to reconnect to an + existing proxy object in the Notification Service in order to receive all + events delivered by that proxy object. This behavior is a departure from the + OMG Specification, which mandates that the Notification Service should throw an + "Already Connected" exception when a consumer attempts to connect to a proxy + that was previously used by another Consumer. + </p> + <p>A new option, -AllowReconnect, is available for the existing <code>Notify_Default_Event_Manager_Objects_Factory + </code> command to support this requirement. As an example of its use, the + following line configures the Notification Service for multi-threaded operation + supporting reconnection.</p> + <code>static Notify_Default_Event_Manager_Objects_Factory "-DispatchingThreads 2 + -SourceThreads 2 -AllowReconnect" </code> + <H3>Configuring Connection (Topology) Reliability</H3> + <p>The support for persistent topology is actually a configurable strategy. + TAO includes an XML Topology Persistence Strategy that uses an XML file for + persistent storage, but it is designed to allow other strategies to be + developed. For example, if topology information should be stored in a + relational database file, it is possible to develop a persistent topology + strategy to do so. The details of doing this are beyond the scope of this + document.
+ </p> + <P>This document describes how to configure the XML topology persistence included + with TAO.</P> + <P>An example of the service configuration command to configure the XML + strategy is: + </P> + <p><code>dynamic Topology_Factory Service_Object* + TAO_CosNotification_persist:_make_XML_Topology_Factory() "-base_path ./reconnect_test" </code> + </p> + <p>The first part of this line: <code>dynamic Topology_Factory Service_Object* + TAO_CosNotification_persist:_make_XML_Topology_Factory()</code> should be given + exactly as shown. For details on this syntax, see chapter 17 of the TAO + Developer's Guide. + </p> + <P>The quoted string at the end of the line contains arguments for the configured + strategy. The arguments recognized by the XML topology strategy implemented in + this project are: + </P> + <ul> + <li> + -v + <li> + -base_path <EM>file_path</EM> + <li> + -backup_count <EM>count</EM> + <li> + -save_base_path <EM>file_path</EM> + <li> + -load_base_path <EM>file_path</EM> + <li> + -no_timestamp + </li> + </ul> + <H4>Topology_Factory Option: -v</H4> + To help diagnose and/or document svc.conf settings, the "-v" option causes the + options for the Topology_Factory to be displayed as they are interpreted. + <H4>Topology_Factory Option: -base_path file_path + </H4> + <P>The argument for this option is a fully qualified path name, without an + extension, for the XML file in which topology information is saved. Three + extensions will be appended to this name: .new, .xml, and .000 + </P> + <P>Saved topology information will be written to the <EM>file_path</EM>.new file. + Information with a .new extension is not necessarily complete and will not be + used to restore the topology. + </P> + <P>When the .new file is complete, the previous <EM>file_path</EM>.000 (if any) + will be deleted, the previous <EM>file_path</EM>.xml (if any) will be renamed + as <EM>file_path</EM>.000 and the <EM>file_path</EM>.new file will be renamed + as <EM>file_path</EM>.xml.
The assumption is that a file system rename operation is + atomic. If this assumption holds, then at any time the file <EM>file_path</EM>.xml + (if it exists) contains the most recent complete save. If <EM>file_path</EM>.xml + does not exist then <EM>file_path</EM>.000 contains the most recent complete + save. If neither of these files exists, the saved topology information is not + available. + </P> + <H4>Topology_Factory Option: -backup_count count</H4> + <P>This option modifies the behavior described in the preceding section to allow + additional backup copies of the topology file to be retained. The default + value, 1, means that only the <EM>file_path</EM>.000 file will be kept. If a + higher number is specified, then older versions will be kept. Rather than + deleting <EM>file_path</EM>.000, the system will rename it to be <EM>file_path</EM>.001. + Older versions will be named <EM>file_path</EM>.002, <EM>file_path</EM>.003 and + so on. + </P> + <P>Under normal circumstances only one backup file is required -- in fact these + additional backup files will not be used to restore the topology. However, + setting this number to a larger value lets the system keep a brief history of + topology changes. Since the XML files are roughly human-readable this can be + used as a diagnostic tool for problems related to Notification Service + topology. + </P> + <H4>Topology_Factory Options: -save_base_path file_path and -load_base_path + file_path + </H4> + <P>These options are alternatives to the -base_path option. They allow the file + from which topology information is loaded at Notification Service startup time + to be different from the file to which this information is saved as the system + runs. + </P> + <P>These options are mostly used for developer testing, but a system administrator may + find an interesting use for them -- possibly involving script files that + rename the XML files during recovery from a Notification Service failure.
+ </P> + <H4>Topology_Factory Option: -no_timestamp</H4> + <P>The XML files include a timestamp to indicate when the information was saved. + The timestamp is for information only and is not needed for correct functioning + of the topology persistence. This option suppresses that timestamp. Doing so + makes it possible to compare XML files using a program like diff to see if the + files represent the same topology. + </P> + <P>This option is intended primarily for testing the persistent topology + implementation. + </P> + <h3>Configuring Event Reliability</h3> + <p>A new service configuration object, "Event_Persistence", can be configured in + the service configuration file to enable and configure Event Reliability. + An example of the line needed to configure the Event_Persistence object is: + </p> + <p><code>dynamic Event_Persistence Service_Object* + TAO_CosNotification_persist:_make_Standard_Event_Persistence() "-v -file_path + ./event_persist.db" </code> + </p> + <p>If this line does not appear in svc.conf, then event reliability + will not be supported. QoS parameters for reliable event delivery will be + silently ignored when Event Reliability is not configured. Event reliability + also requires topology reliability, so if this line appears there must also be + a "Topology_Factory" line in the file. If not, the Notification Service will + fail to start up. + </p> + <P>The beginning of this line, up to and including the parentheses, should appear + exactly as shown. For details on this syntax, see chapter 17 of the TAO + Developer's Guide. The quoted string at the end of the line contains options + for Event_Persistence. + </P> + <h4>Event_Persistence Option: -v</h4> + <p>This option and any option that appears after this option will be written to + the log (normally the console) as it is processed. This is intended to help + diagnose and document the Event Persistence settings. The default is to + configure Event Persistence silently.
+ </p> + <h4>Event_Persistence Option: -file_path path + </h4> + <p>This option gives the fully qualified name for the file in which + persistent event information will be stored. The file should be configured on a + reliable device that supports synchronized writes (i.e. flushing the operating + system's write cache). A device that is suitable for storing a reliable + database would be appropriate for storing this file. The file will be subject + to a relatively high number of small (single block) write requests, but very + few, if any, read requests. If the file does not exist, then a new file will be + created. If the file does exist, and if topology is successfully loaded, the + events from this file will be reloaded and redelivered automatically. This is a + required option. There is no default value. + </p> + <h4>Event_Persistence Option: -block_size n + </h4> + <p>This option gives the block size in bytes for the device on which the event + reliability file is stored. For both performance and reliability reasons it is + important that the value matches the physical characteristics of the device. + The default value is 512. + </p> + <h2>Application Programming Changes to Support Reliability</h2> + <p> + When it is configured as described above, the Notification service + supports reliable connectivity and/or event delivery. + Actually achieving such reliability, however, requires cooperation from the + Notification service clients (Suppliers and Consumers). + <P> + There are a number of failure possibilities and different recovery techniques + are needed to handle them. The simplest case is when a client + fails and is restarted. + <P> + The Notification service will have maintained the connection points (Supplier + and Consumer Admins, Proxy Consumers, Proxy Suppliers, etc.) As each of these + connections was established, an ID was returned by the Notification + service.
An application that wishes to be reconnected after a failure + should save a persistent copy of these IDs. For example, it could write + the IDs to a file, then read them back from the file after restarting. + Using these IDs the application can reconnect to the existing connection + points in the Notification service. The reconnection to the Proxy objects + will only work if the Notification service has been configured with the + -AllowReconnect option described above, but otherwise this process is fairly + straightforward. + <P> + As soon as a supplier has reconnected, it can resume sending events. As + soon as a consumer has reconnected, persistent events (if any) and new events + will start to arrive. + <P> + Notice that the identity of a consumer or supplier is determined by these saved + IDs. This is true even if the restarted client is running on a completely + different machine from the original client. + <P> + The case of the Notification service itself failing then being restarted on the + same or a different machine is somewhat more complicated. The + Notification service wasn't designed to initiate a connection to a + client. It must wait for the client to reconnect before it can start + accepting or delivering events. The difficulty is in having the client + know when to initiate the reconnection, and where the Notification service + is running in case it was necessary to move it to a new machine due to the + failure. + <H3>Reconnection Registry</H3> + <p>The reconnection registry provides an answer to the question of how the client + knows where and when to reconnect to the Notification service. This + TAO-specific interface is implemented by the EventChannelFactory in the + reliable Notification Service.
Clients can narrow the EventChannelFactory + object reference to a Reconnection Registry interface, then register a + Reconnection Callback object that will be notified when the Notification + service has restarted and is ready for reconnections. The + EventChannelFactory passes its own object reference to the Reconnection + Callback object to inform the client where the Notification service is now + running.</p> + <P>The interfaces involved are defined in the NotifyExt.idl file (in + $TAO_ROOT/orbsvcs/orbsvcs) and are shown here:</P> + <pre> + /** + * \brief An interface which gets registered with a ReconnectionRegistry. + * + * A supplier or consumer must implement this interface in order to + * allow the Notification Service to attempt to reconnect to it after + * a failure. The supplier or consumer must register its instance of + * this interface with the ReconnectionRegistry. + */ + interface ReconnectionCallback + { + /// Perform operations to reconnect to the Notification Service + /// after a failure. + void reconnect (in Object new_connection); + + /// Check to see if the ReconnectionCallback is alive + boolean is_alive (); + }; + + /** + * \brief An interface that handles registration of suppliers and consumers. + * + * This registry should be implemented by an EventChannelFactory and + * will call the appropriate reconnect methods for all ReconnectionCallback + * objects registered with it. + */ + interface ReconnectionRegistry + { + typedef unsigned long ReconnectionID; + ReconnectionID register_callback(in ReconnectionCallback reconection); + + void unregister_callback (in ReconnectionID id); + + /// Check to see if the ReconnectionRegistry is alive + boolean is_alive (); + }; + </pre> + <H3>Using Event Reliability</H3> + <P>Configuring the Notification service for reliable event delivery is necessary, + but not sufficient to enable reliable handling of events.
The application + code in either the client or the server must configure the EventChannel through + which the events are delivered to operate in the reliable mode. This is + done by setting the QoSProperties named "ConnectionReliability" and + "EventReliability" to the value "persistent" -- either at the time the channel + is created or at a later time using the set_qos method.</P> + <P>Once a channel has been configured for reliable operation, persistence can be + disabled on an event by event basis using QoSProperties of the event + itself. This could be done, for example, to avoid the overhead of + persistently storing events for which reliability is not needed.</P> + <P>The supplier sends events to the EventChannel using a push() method. For + persistent events, this call will not return to the supplier until the + Notification service is prepared to guarantee event delivery. + </P> + <P>Application code in the Consumer should be written with the knowledge that + events are guaranteed to be delivered, but during recovery from a failure there + is a possibility that an event may arrive more than once. This could + happen, for example, if the event was in the process of being delivered at the + time the failure occurred and the failure prevents the Notification service from + determining if the delivery completed successfully. To meet its + commitment that every event will be delivered, the Notification service will + retry the delivery in this case, which may result in a duplicate event.</P> + <P>As long as this situation is understood at the time the application is + designed, it should be possible for the application to handle this situation.</P> + </body> +</html> |
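The save sequence that reliability.html describes for -base_path and -backup_count (write file_path.new, shift the numbered backups, rename file_path.xml to file_path.000, then rename file_path.new to file_path.xml) can be modeled in a few lines. The following is only a sketch of the described renames, not the code in this commit: TAO's XML topology strategy is C++, and the function names `rotate_and_publish` and `latest_complete_save` are hypothetical names chosen for illustration.

```python
import os
from pathlib import Path


def rotate_and_publish(base_path: str, backup_count: int = 1) -> None:
    """Promote base_path.new to base_path.xml, keeping numbered backups.

    Hypothetical model of the documented behavior; not TAO code.
    """
    if backup_count < 1:
        raise ValueError("backup_count must be at least 1")

    def backup(n: int) -> Path:
        # Backups are named base_path.000, base_path.001, ...
        return Path(f"{base_path}.{n:03d}")

    # Delete the oldest backup that would otherwise exceed backup_count.
    oldest = backup(backup_count - 1)
    if oldest.exists():
        oldest.unlink()

    # Shift the remaining backups up by one: .000 -> .001, and so on.
    for n in range(backup_count - 2, -1, -1):
        if backup(n).exists():
            os.replace(backup(n), backup(n + 1))

    # The previous complete save becomes the newest backup.
    xml_file = Path(base_path + ".xml")
    if xml_file.exists():
        os.replace(xml_file, backup(0))

    # Publish the freshly written file. The scheme relies on this rename
    # being atomic, as the document assumes.
    os.replace(base_path + ".new", xml_file)


def latest_complete_save(base_path: str):
    """Return the file to restore from, or None if nothing was saved."""
    for suffix in (".xml", ".000"):
        candidate = Path(base_path + suffix)
        if candidate.exists():
            return candidate
    return None
```

If a failure occurs between the last two renames, `latest_complete_save` falls back to the .000 file, which matches the recovery rule stated in the -base_path section. The .new file is never consulted because it may be incomplete.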