M. Santavy
Chemical Computing Group Inc.


Introduction

MOE was developed around an embedded programming language, the Scientific Vector Language (SVL), which is used by both applications developers and MOE users to extend and customize existing features of MOE, and to rapidly design, test, and deploy entirely new features and methodology.

SVL itself can be extended by integrating external applications and existing in-house code into it. Although communication via files or pipes, already built into SVL, is often adequate for interfacing with external applications, a tighter integration of SVL features with the functionality of external routines is sometimes desirable. The SVL C Application Programming Interface (API) provides the highest level of integration and control, linking SVL to the external application at the object-code level.

The SVL C API is a set of functions (methods) for integrating the functionality of external applications or existing in-house code into SVL. The primary goal of the SVL C API is to allow the application programmer to extend the SVL system with features that access an external system or library of functions. This allows for a direct, fast, and efficient connection to existing external programs and routines. Once integrated, the external functions become part of the SVL system of built-in functions.

The SVL C API was designed to:

  • Minimize the possibility of inadvertent corruption of the SVL data space.
  • Properly handle errors.
  • Efficiently manage application-internal resources.
  • Efficiently manipulate character strings.
  • Easily implement vector unit extension.
  • Allow applications to work with both single and multiple data arguments.

API Overview

The primary function of the API is to allow the application programmer to build functions in C and to attach them to the SVL system. The API contains a set of methods to communicate between the internals of SVL and the external application's private functions or library routines. It provides a mechanism to attach the application's routines to SVL function names which can then be used to invoke these functions using regular SVL syntax. Further methods exist for examining the structure and contents of SVL function arguments and vector variables, and for transferring data between SVL vector buffers and the application's internal C structures. The API also contains methods to work with error conditions, SVL token types, and application resource allocation.

The API methods are used to access the input argument of the associated SVL function, to calculate and to store the result of the function, and to return an error code indicating what type of error, if any, has occurred. As an illustration, the following function calculates the length of a given vector (it behaves identically to the SVL built-in function "length"):

    svl_error api_MyLength (svl_task task)
    {
      svl_var var = svl_TaskVar(task); /* extract argument */
      int len = svl_Length(var); /* calculate its length */
      svl_error e = svl_Put_i(var, 0, len); /* store the result */
      return e; /* return the error code */
    }

Function Binding. A C function is attached to an SVL function symbol in the initialization portion of the application code. For example, the following call, made in a special API initialization function, will attach the function "api_MyLength" to the SVL symbol "MyLength":

    svl_MustAddApiFunc("MyLength", api_MyLength, 0);

New constants can also be created, by attaching values to SVL symbols. For example, the following call will attach the value of 22/7 = 3.14286... to the SVL symbol "MyPi":

    svl_MustMakeConst_R("MyPi", 22.0 / 7.0);

Execution Threads: Tasks. Every execution of an SVL program is carried out with the executable code, the execution states, the temporary variables, and the program resources all stored in a general container called an SVL task. The task itself uses a single special container - an SVL variable - for storing both the input and the output of the currently executing SVL function.

Generally, the argument/result is either a single item or a list of items, each of a very simple shape: either a scalar of a given type or a flat array of data of the same, given type. The API provides a wealth of methods to retrieve from and to store to SVL variables both scalar items and flat vectors of data. These methods can be further nested, which allows the programmer to traverse and read from or to create and write to variables of arbitrary shape.

Getting and Putting Data. Two families of access methods exist. The svl_Get... methods move data from SVL variables to C structures. The svl_Put... methods move data in the opposite way, from C structures to SVL variables.

Each Get/Put method works on a specified element of a given variable, copying the contents of that element to or from a specified C structure, which can be either a single variable or a C array. The data type is indicated by the suffix of each function name, which also determines the expected or the resulting type of the SVL variable.

The following call is an example of the use of a Get function. It retrieves a vector of 10 floats from the second element of SVL variable v, and stores them into buffer buff:

    svl_Get_f_(v, 2, buff, 10);

If the second element of v contains ten numbers, the ten values will be converted to the C type "float" and copied to buff. If the second element of v contains neither ten numbers nor a single number (a special case described below), the svl_Get_f_() function will return an error condition, allowing the API programmer to prevent further evaluation.

Unit Extension. SVL functions and expressions that work with multiple vectors often require that the argument vectors be of the same length. In such cases, it is an error to have vectors of unequal length. There is one notable exception: a vector of unequal length is permitted if it is a "unit", i.e., of length 1. A unit vector automatically extends to match the length of another vector. For example, the expression [1,2,3] + 1 is interpreted as [1,2,3] + [1,1,1]. This behavior is called unit-extension.

The API allows the programmer to write unit-extending functions with very little effort, with most API methods providing for the unit-extension automatically. For example, if the variable v in the call to "svl_Get_f_" above contains a single number (rather than a vector of ten numbers), the buffer buff will be automatically filled with ten copies of that number.
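The unit-extension rule itself is simple enough to sketch in plain C. The helper below is hypothetical (demo_get_extended is not an API method) and works on ordinary C arrays rather than SVL variables; it only illustrates the rule that a source of length 1 is replicated, a source of the requested length is copied, and anything else is an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fill dst (length n) from src (length src_len).
 * A source of length 1 is replicated n times (unit-extension);
 * a source of length n is copied element by element.
 * Returns 0 on success, -1 if the lengths do not conform. */
int demo_get_extended(const float *src, size_t src_len, float *dst, size_t n)
{
    size_t i;
    assert(src != NULL && dst != NULL);
    if (src_len != n && src_len != 1)
        return -1;                          /* non-conforming length */
    for (i = 0; i < n; i++)
        dst[i] = src[src_len == 1 ? 0 : i]; /* replicate or copy */
    return 0;
}
```

A caller that asks for three values and passes a one-element source receives three copies of that element; a two-element source is rejected.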

Errors. Every SVL function must be prepared to deal with errors. Since SVL is used both as a programming language and as an interactive command language, it is quite common that a function is presented with a "wrong" argument such as an invalid expression or variable name. The SVL system must recover from such errors gracefully, without crashing or leaking memory.

The API solves the problem of error detection and recovery by having all functions that may fail report an error condition. There is a strict convention on how to use the error codes to ensure that all errors are properly detected and dealt with. An error condition is passed by an error handle to the top-level calling function, which displays the text describing the error.

In the example to follow, a built-in function reads the contents of the function argument as a C-string of maximum 30 characters. The characters are converted to upper-case and returned.

When the argument to a function is not of the desired type or length, or when the result cannot be stored because no more memory can be allocated by the machine, the function will fail. The error condition arising from such a failure is passed to the caller of the failing function, which will stop further execution and return the error condition to its own caller.

    svl_error api_ToUpper(svl_task task)
    {
      svl_error e = NULL; /* the error condition */
      svl_var v = svl_TaskVar(task); /* the input/output variable */
      char *c, str[30];

      E_(svl_Get_string(v, 0, str, 30)); /* extract the argument */
      for (c = str; *c; c++)
        *c = toupper((unsigned char)*c); /* calculate the result */
      E_(svl_Put_string(v, 0, str)); /* store the result */
    X_:
      return e; /* return the error condition */
    }

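Note that E_(expr) is a user-defined macro, not an API method, and is used to detect and return errors. It is defined along the lines of "if (expr) goto X_; else"; for the function to return the error through e, the expansion must also assign the value of expr to e, for example "if ((e = (expr)) != NULL) goto X_; else".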
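The convention can be tried outside SVL with stand-in types. The sketch below assumes an error handle that is NULL on success (here simply a pointer to a message string) and an E_ macro that stores the result in e before jumping to the exit label; demo_error, demo_get_string, and demo_to_upper are hypothetical names, not API methods.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error handle: NULL means success. */
typedef const char *demo_error;

/* Store the result in e and jump to the exit label on failure. */
#define E_(expr) if ((e = (expr)) != NULL) goto X_; else

/* Stand-in for a Get method: copy src into dst, whose capacity is cap. */
demo_error demo_get_string(const char *src, char *dst, size_t cap)
{
    assert(src != NULL && dst != NULL);
    if (strlen(src) >= cap)
        return "string too long";
    strcpy(dst, src);
    return NULL;
}

/* Upper-case src into out, passing any error on to the caller. */
demo_error demo_to_upper(const char *src, char *out, size_t cap)
{
    demo_error e = NULL;
    char *c;

    E_(demo_get_string(src, out, cap));   /* may fail */
    for (c = out; *c; c++)
        *c = (char)toupper((unsigned char)*c);
X_:
    return e;                             /* NULL or the error */
}
```

The trailing "else" in the macro keeps a call such as E_(f()); safe inside an enclosing if/else, since the statement that follows binds to the macro's own "if".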

Resource Management. The API provides methods to manage memory buffers, SVL tokens, and other resources. Typically, a pointer or a handle is initialized in the initialization section of the routine to a specific value, such as NULL or svl_nullTok (the null token). The pointer may subsequently be reset, reallocated, or modified in the body of the routine. With each manipulation of the pointer, the error condition must be checked and, in the case of failure, control must pass immediately to the exit section of the routine, where any private data structures are cleared and any associated buffers are freed before the error condition is returned to the calling function.

In the example "api_ToUpper" above, the maximum allowed length of the extracted string is 30 characters. Suppose we wish to allow strings of arbitrary length. We must then allocate a buffer of the right size to hold the string. First, we extract the argument as a token (an SVL construct which can hold character strings of arbitrary length). Then, knowing the number of characters in the token, we allocate a buffer to hold the string for conversion to upper-case.

    svl_error api_ToUpper(svl_task task)
    {
      svl_error e = NULL;
      svl_var v = svl_TaskVar(task);
      char *c, *str = NULL; /* initialize to NULL */
      svlTok tok = svl_nullTok; /* initialize to svl_nullTok */
      int len; /* length of the string */

      E_(svl_Get_T(v, 0, &tok)); /* extract the argument */
      len = svl_LenTok(tok);

      E_(svl_MallocE(&str, len+1)); /* allocate the buffer */
      memcpy(str, tok, len+1); /* copy the string */
      for (c = str; *c; c++)
        *c = toupper((unsigned char)*c); /* calculate the result */
      E_(svl_ReallocnTokE(&tok, str, len)); /* convert to token */

      E_(svl_Put_T(v, 0, tok)); /* store the result */

    X_:
      svl_Free(str); /* free the buffer */
      svl_FreeTok(tok); /* free the token */
      return e; /* return the error condition */
    }

String Manipulation. A token is a string of characters treated as a unit. Token handles give API programmers a fast and efficient means of searching for strings and looking them up in tables. Handles can be compared directly, without examining the associated strings. Moreover, they can be used to generate hash keys for fast, direct-access association tables.
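The reason handles can be compared directly is easiest to see in a small string-interning sketch. This is an illustration of the general technique only, under the assumption that each distinct string is stored once and identified by the address of that single copy; demo_tok and demo_intern are hypothetical names, and a linear search stands in for whatever lookup structure a real token table would use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token handle: the address of the unique stored copy. */
typedef const char *demo_tok;

#define DEMO_MAX_TOKS 64

static char *demo_table[DEMO_MAX_TOKS];
static size_t demo_ntoks = 0;

/* Return the unique handle for s, creating it on first use.
 * Returns NULL if the table is full or memory cannot be allocated. */
demo_tok demo_intern(const char *s)
{
    size_t i, len;
    char *copy;

    assert(s != NULL);
    for (i = 0; i < demo_ntoks; i++)      /* already interned? */
        if (strcmp(demo_table[i], s) == 0)
            return demo_table[i];
    if (demo_ntoks == DEMO_MAX_TOKS)
        return NULL;
    len = strlen(s) + 1;
    copy = malloc(len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);                 /* store the single copy */
    demo_table[demo_ntoks++] = copy;
    return copy;
}
```

Once two strings have been interned, testing them for equality is a single pointer comparison, and the handle value itself can serve as a hash key.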

More Advanced Features. Typically, when working with the API, it is sufficient to use flat vectors stored in the individual elements of the argument or the result of the built-in function. Sometimes, however, more complex structures of SVL variables must be examined or created. To this end, the API provides nested, context-sensitive versions of the Get and Put methods. These will be treated in detail in a future article.

Many applications use internal resources, such as files, databases, memory buffers, or graphic objects. The API offers the application programmer a simple and effective means by which to manage the application's resources - "task memo" methods. These allow the application to allocate and free resources in response to the actual need for them.

Summary

The SVL C API allows tight integration of SVL with existing in-house code and libraries. It was designed for simplicity of use, for efficiency, and, at the same time, for flexibility to satisfy sophisticated needs. It was also designed to provide a cushion of safety for the application programs.

Safety is of critical importance. If the SVL internal structures become corrupted as a result of erroneous manipulation by a single function, the whole system (not just the function in question) fails. The API's methods, types, and contexts make it relatively difficult for an application routine to misuse the API methods and unwittingly corrupt the application.

All of these features combine to create a highly usable C API. Indeed, the Chemical Computing Group development teams use the SVL C API internally to develop new built-in functionality: the SVL C API contains the tools needed for developing serious, full-scale applications. A non-trivial but straightforward example of programming with the C API is the interface to the Daylight Toolkits, which is distributed with MOE 1997.09.