
ET Programming Details

This chapter gives some details on programming with an ET system. It answers questions about program flow, handling signals, useful ET library functions, how to define user functions for selecting events, and various odds & ends.

Program Flow

Because ET is a complicated, multithreaded, multiprocess system, it is not obvious how to combine the calls to the ET library in a coherent manner. Given below is a bare-bones outline of how a local user's process should look.

/* declare variables */
int status;
et_statconfig sconfig;
et_openconfig openconfig;
et_event  *pe;
et_sys_id  id;
et_stat_id my_stat;
et_att_id  attach;
 
/* define station */
et_station_config_init(&sconfig);
et_station_config_setblock(sconfig, ET_STATION_BLOCKING);
et_station_config_setselect(sconfig, ET_STATION_SELECT_ALL);
et_station_config_setuser(sconfig, ET_STATION_USER_SINGLE);
et_station_config_setrestore(sconfig, ET_STATION_RESTORE_OUT);
 
/* open ET system */
et_open_config_init(&openconfig);
et_open(&id, "/tmp/my_et_system_file", openconfig);
et_open_config_destroy(openconfig);
 
/* create and attach to station */
et_station_create(id, &my_stat, "my_station", sconfig);
et_station_attach(id, my_stat, &attach);

while(et_alive(id)) {
    status = et_event_get(id, attach, &pe, ET_SLEEP, NULL);
    status = et_event_put(id, attach, pe);
}

Apart from defining a station, the first step is to call et_open. This maps the given file into the user's memory, giving access to the ET system. It also starts a heartbeat and begins listening for the ET system's heartbeat. If the ET system dies and is restarted, et_open need not be called again. After an et_close, however, et_open must be called again to regain access to the ET system.

Create any desired stations, then attach to one of them. By attaching, one receives a unique identifier (attach in this case). This will be used in the rest of the transactions.

Once finished attaching, one can read and write events, checking every now and then to see if the ET system is alive. If the ET system dies while the user is waiting to get events, the get call will return with the error ET_ERROR_WAKEUP. Although not shown in this code, be sure to carefully check the status of each read and write statement.

Handling Signals

Because the ET software uses multiple POSIX threads, signal handling must be done carefully. Be sure to use POSIX routines and only those that are thread safe. Refer to the book Programming with POSIX Threads by David Butenhof for a good reference on this subject.

Functions that meet this standard are pthread_sigmask, pthread_kill, sigwait, sigwaitinfo, and sigtimedwait. When masking signals, use pthread_sigmask, NOT sigprocmask: the behavior of sigprocmask in a multithreaded process is unspecified.

The best approach is to block (mask) all signals with pthread_sigmask at the start of the program. The threads started by et_open will then also have all signals blocked, because a new thread inherits the signal mask of the thread that created it. Once the ET system is open, catch signals in the main thread or in an additional thread spawned from the main thread (see et_client.c). A separate signal handling thread can use sigwait to wait for specific signals. This arrangement is very convenient, but care must be taken because the main thread continues to execute while the signal is being handled.
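The approach described above might be sketched as follows. This is only an illustration built from standard POSIX calls. The helper names block_all_signals, signal_thread, and start_signal_thread, the variable last_signal, and the choice of SIGINT and SIGTERM are assumptions made for this example and are not part of the ET library.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Signal number most recently received by the signal thread (0 = none yet).
 * Hypothetical variable, used here only to record what happened. */
static volatile sig_atomic_t last_signal = 0;

/* Block every signal in the calling thread. Intended to be called in main
 * before et_open, so that threads created afterwards inherit the mask. */
int block_all_signals(void)
{
    sigset_t all;

    sigfillset(&all);
    return pthread_sigmask(SIG_BLOCK, &all, NULL);
}

/* Dedicated signal thread: waits synchronously for SIGINT or SIGTERM. */
void *signal_thread(void *arg)
{
    sigset_t wait_set;
    int signo = 0;

    (void) arg;
    sigemptyset(&wait_set);
    sigaddset(&wait_set, SIGINT);
    sigaddset(&wait_set, SIGTERM);

    if (sigwait(&wait_set, &signo) == 0) {
        last_signal = signo;
        /* A real client would clean up here, for example by closing
         * its connection to the ET system. */
    }
    return NULL;
}

/* Start the signal thread; returns 0 on success, as pthread_create does. */
int start_signal_thread(pthread_t *tid)
{
    return pthread_create(tid, NULL, signal_thread, NULL);
}
```

In a client, main would call block_all_signals first, then et_open, then start_signal_thread. Because sigwait receives the signal synchronously in an ordinary thread, the code that follows it is not restricted to async-signal-safe functions, unlike a conventional signal handler.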

Defining Functions for Event Selection

If a station needs an event selection capability that the ET system does not already provide, the user can supply it by defining a function especially for that purpose. The function must be part of a shared library and must have the form:

int et_my_function(et_sys_id id, et_stat_id stat_id, et_event *pe)

This function will be called whenever its associated station is collecting events to gather into its input list. The return value must be one for a selected event and zero otherwise.
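As an illustration, a selection function that accepts every tenth event offered to its station might look like the sketch below. The three typedefs are stand-ins that only allow the sketch to compile on its own; in a real shared library they would be removed and et.h included instead. The interval SELECT_EVERY_NTH is hypothetical, and the static counter assumes that the function is called from one thread at a time, which should be verified before relying on it.

```c
/* Stand-in types (assumptions for illustration only). In a real shared
 * library, delete these three typedefs and use #include "et.h" instead. */
typedef void *et_sys_id;
typedef int et_stat_id;
typedef struct et_event_t et_event;

#define SELECT_EVERY_NTH 10   /* hypothetical selection interval */

/* Returns 1 for every tenth event offered and 0 for all others. */
int et_my_function(et_sys_id id, et_stat_id stat_id, et_event *pe)
{
    /* Counts the events offered so far. Not thread safe, and shared by
     * every station that uses this function. */
    static unsigned long offered = 0;

    /* This simple sketch ignores the system, the station and the event. */
    (void) id;
    (void) stat_id;
    (void) pe;

    offered++;
    return (offered % SELECT_EVERY_NTH == 0) ? 1 : 0;
}
```

A more realistic function would examine the event through pe before deciding whether to return one or zero.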

The function-writer has access to the event's data through functions mentioned in the previous chapter. Similarly, there is access to information about the station's configuration through the following ET library functions:

1.      et_station_getattachments(et_sys_id id, et_stat_id stat_id, int *numatts) : gets the number of attachments to a station.

2.      et_station_getstatus(et_sys_id id, et_stat_id stat_id, int *status) : gets a station's status.

3.      et_station_getblock(et_sys_id id, et_stat_id stat_id, int *block) : gets a station's blocking mode.

4.      et_station_getrestore(et_sys_id id, et_stat_id stat_id, int *restore) : gets a station's restore mode.

5.      et_station_getuser(et_sys_id id, et_stat_id stat_id, int *user) : gets a station's user mode.

6.      et_station_getprescale(et_sys_id id, et_stat_id stat_id, int *prescale) : gets a station's prescale value.

7.      et_station_getcue(et_sys_id id, et_stat_id stat_id, int *cue) : gets a station's cue value.

8.      et_station_getselect(et_sys_id id, et_stat_id stat_id, int *select) : gets a station's select mode.

9.      et_station_getselectwords(et_sys_id id, et_stat_id stat_id, int *select) : gets a station's selection integer array.

10.  et_station_getlib(et_sys_id id, et_stat_id stat_id, char *lib) : gets a station's shared library name.

11.  et_station_getfunction(et_sys_id id, et_stat_id stat_id, char *function) : gets a station's function name.

12.  et_station_getinputcount(et_sys_id id, et_stat_id stat_id, int *cnt) : gets the number of events in a station's input list. This value can change so quickly that it may be of limited use.

13.  et_station_getoutputcount(et_sys_id id, et_stat_id stat_id, int *cnt) : gets the number of events in a station's output list. This value can change so quickly that it may be of limited use.

Using these functions, all relevant information about the ET system necessary to select events for a particular station can be obtained.

Useful ET Library Functions

There are a number of other routines available to the ET system users. Use the following to get information about stations:

1.      et_station_name_to_id(et_sys_id id, et_stat_id *stat_id, char *name) : returns a station id given a station's name.

2.      et_station_isattached(et_sys_id id, et_stat_id stat_id, et_att_id att) : tells if "att" is attached to a station.

3.      et_station_exists(et_sys_id id, et_stat_id *stat_id, char *stat_name) : tells if a station exists and returns its id.

There are routines available to get information about an ET system:

1.      et_system_getnumevents(et_sys_id id, int *numevents) : tells how many events a system has.

2.      et_system_geteventsize(et_sys_id id, int *eventsize) : tells the size in bytes of a system's events.

3.      et_system_getlocality(et_sys_id id, int *locality) : tells whether the ET system is on a remote node or is local, or is local on a system which cannot share mutexes.

4.      et_system_getpid(et_sys_id id, pid_t *pid) : gives the unix process id (pid) of the ET system process.

5.      et_system_getheartbeat(et_sys_id id, int *heartbeat) : tells the heartbeat count.

6.      et_system_getattsmax(et_sys_id id, int *attsmax) : tells the max number of attachments allowed.

7.      et_system_getstationsmax(et_sys_id id, int *stationsmax) : tells the max number of stations allowed.

8.      et_system_gettempsmax(et_sys_id id, int *tempsmax) : tells the max number of temporary events allowed.

9.      et_system_getprocsmax(et_sys_id id, int *procsmax) : tells the max number of processes allowed to open the ET system locally.

10.  et_system_getattachments(et_sys_id id, int *atts) : tells the current number of attachments.

11.  et_system_getstations(et_sys_id id, int *stations) : tells the current number of stations.

12.  et_system_gettemps(et_sys_id id, int *temps) : tells the current number of temporary events.

13.  et_system_getprocs(et_sys_id id, int *procs) : tells the current number of processes with the ET system open locally.

14.  et_system_gethost(et_sys_id id, char *host) : tells which host computer the ET system is running on.

15.  et_system_getserverport(et_sys_id id, unsigned short *port) : tells the port number of the ET system's TCP server thread.

Some routines affecting attachments are:

1.      et_wakeup_attachment(et_sys_id id, et_att_id att) : this routine wakes up a particular attachment which is currently blocked on an event read call on a particular station.

2.      et_wakeup_all(et_sys_id id, et_stat_id stat_id) : this routine wakes up all attachments which are currently blocked on an event read call on a particular station.

3.      et_attach_geteventsput(et_sys_id id, et_att_id att_id, int *highint, int *lowint) : this routine gets the number of events put into a station by an attachment. The count is returned as 2 integers (64 bits of data in total).

4.      et_attach_geteventsget(et_sys_id id, et_att_id att_id, int *highint, int *lowint) : this routine gets the number of events gotten from a station by an attachment. The count is returned as 2 integers (64 bits of data in total).

5.      et_attach_geteventsdump(et_sys_id id, et_att_id att_id, int *highint, int *lowint) : this routine gets the number of events dumped by an attachment. The count is returned as 2 integers (64 bits of data in total).

6.      et_attach_geteventsmake(et_sys_id id, et_att_id att_id, int *highint, int *lowint) : this routine gets the number of new events gotten from a station by an attachment. The count is returned as 2 integers (64 bits of data in total).
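Since each of these counts is returned as two integers, the caller has to combine them into a single value. A minimal sketch follows, assuming that highint holds the upper 32 bits and lowint the lower 32 bits of the count. This layout is inferred from the argument names and should be checked against the library source. The helper name combine_event_count is hypothetical.

```c
#include <stdint.h>

/* Combine the two halves of an event count into one 64-bit value.
 * Assumes highint is the upper 32 bits and lowint the lower 32 bits. */
uint64_t combine_event_count(int highint, int lowint)
{
    /* Convert through uint32_t first so that a negative int does not
     * sign-extend into the other half of the result. */
    uint64_t high = (uint32_t) highint;
    uint64_t low  = (uint32_t) lowint;

    return (high << 32) | low;
}
```

After a call such as et_attach_geteventsput(id, att_id, &highint, &lowint), the two results would be passed to combine_event_count to obtain the full count.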

Finally, two routines report whether the ET system is alive:

1.      et_alive(et_sys_id id) : this returns 1 if the ET system is alive and 0 if it is not.

2.      et_wait_for_alive(et_sys_id id) : this waits indefinitely until the ET system is alive and then it returns.

How to Avoid Blocking Forever

Be careful when attaching to more than one station at a time. Multiple attachments and blocking stations are a bad combination. If one is reading and writing from a blocking station, there is the potential to lock up the whole ET system.

The problem arises when the read and write statements of a program are done serially in a single logical loop. Without going into the details, in some circumstances, events all pile up in the input list of one station while the user is waiting to read events from another station. Check your logic carefully.

Similar problems can arise when producing events at an attachment that is also being used for reading or consuming events. The difficulty is that if the user blocks when calling et_event_new, all the events may have previously piled up in the user's station's input list. In this situation the call to et_event_new will never return.

Includes, Flags, and Libraries

Using the ET system library functions requires users to include the file et.h in any programs, as in the following:

#include "et.h"

The name of the ET shared library is libet.so, and the name of the static library is libet.a.

See the SConstruct file in the ET distribution for other possibly necessary libraries.

On both Solaris and Linux, pthread mutexes have the default behavior such that if a mutex is locked by some thread, any other thread may unlock it. This is non-portable behavior and must not be relied on according to the man pages. However, its use is very convenient when recovering from a crashed process which has locked one or more mutexes. The alternative method to recover from such situations is to re-initialize the locked mutexes. Such behavior can be implemented at compile time by specifying the flag "-DMUTEX_INIT".

Debug Output

To help in finding problems and finding out information about an active ET system, users can adjust the debug output printed by the system. The two routines used for this purpose are:

1.      et_system_setdebug(et_sys_id id, int debug) : sets the level of debug output desired.

2.      et_system_getdebug(et_sys_id id, int *debug) : gets a system's current debug level.

The possible values of the argument debug are:

·         ET_DEBUG_NONE - this value results in no output

·         ET_DEBUG_SEVERE - this value outputs only the most severe errors

·         ET_DEBUG_ERROR - this value outputs all errors

·         ET_DEBUG_WARN - this value outputs all errors and all warnings

·         ET_DEBUG_INFO - this value outputs everything including informational output

The debug level of an ET system or consumer defaults to ET_DEBUG_ERROR. Notice that the debug level of a system can only be set after the call to et_open or et_system_start.

By default, debug output is simply printed by means of printf statements. To use the CODA routine daLogMsg to output debug messages instead, recompile ET with the flag -DWITH_DALOGMSG and be sure to link with the library libcmlog.so.

Monitoring an ET System

There is a program provided to monitor an ET system. It simply maps the ET system into its memory if it's local or gets data over the network if remote and prints out the values that it reads there. If the reader does run into trouble, this program can help isolate any problems. The usage is:

et_monitor -f <ET name> [-h] [-r] [-host <ET host>] [-t <time period (sec)>] [-p <ET server port>] [-u <udp port>]

          -f      ET system's (memory-mapped file) name
          -h      help
          -r      connect with local host as if remote
          -host   ET system's host
          -t      time period in seconds between updates
          -p      ET TCP server port
          -u      UDP port

It defaults to the local host with a period of 5 seconds between updates. To make the monitor communicate with the ET system as if it were remote even when it is local, use the -r option. The value of <ET host> can be provided in various formats. It can be an IP address in dotted-decimal form, the name of the host with or without the domain, ".local" or "localhost" which means look locally only, ".remote" which means look remotely only, or ".anywhere" which means any local or remote node which responds.
