Wednesday, October 12, 2011

The Essential RPG IV Style Guide

Professional programmers appreciate the importance of standards in developing programs that are readable, understandable, and maintainable. The issue of programming style goes beyond any one language, but the introduction of the RPG IV syntax demands that we reexamine standards of RPG style. With that in mind, here are some simple rules of thumb you can use to ensure that bad code doesn't happen to good RPG software construction.

Use comments to clarify - not echo - your code.
Comments that merely repeat the code add to a program's bulk, but not to its value. In general, you should use comments for just three purposes:
to provide a brief program or procedure summary
to give a title to a subroutine, procedure, or other section of code
to explain a technique that isn't readily apparent by reading the source

Always include a brief summary at the beginning of a program or procedure.
This prologue should include the following information:
a program or procedure title
a brief description of the program's or procedure's purpose
a chronology of changes that includes the date, programmer name, and purpose of each change
a summary of indicator usage
a description of the procedure interface (the return value and parameters)
an example of how to call the procedure

Use consistent "marker line" comments to divide major sections of code.
For example, you should definitely section off with lines of dashes or asterisks the declarations, the main procedure, each subroutine, and any subprocedures. Identify each section for easy reference.

Use blank lines to group related source lines and make them stand out.
In general, you should use completely blank lines instead of blank comment lines to group lines of code, unless you're building a block of comments. Use only one blank line, though; multiple consecutive blank lines make your program hard to read.

Avoid right-hand "end-line" comments in columns 81-100.
Right-hand comments tend simply to echo the code, can be lost during program maintenance, and can easily become "out of synch" with the line they comment. If a source line is important enough to warrant a comment, it's important enough to warrant a comment on a separate line. If the comment merely repeats the code, eliminate it entirely.

Declarations


With RPG IV, we finally have an area of the program source in which to declare all variables and constants associated with the program. The D-specs organize all your declarations in one place.

Declare all variables within D-specs.
Except for key lists and parameter lists, don't declare variables in C-specs - not even using *LIKE DEFN. Define key lists and parameter lists in the first C-specs of the program, before any executable calculations.

Whenever a literal has a specific meaning, declare it as a named constant in the D-specs.
This practice helps document your code and makes it easier to maintain. One obvious exception to this rule is the allowable use of 0 and 1 when they make perfect sense in the context of a statement. For example, if you're going to initialize an accumulator field or increment a counter, it's fine to use a hard-coded 0 or 1 in the source.

Indent data item names to improve readability and document data structures.
Unlike many other RPG entries, the name of a defined item need not be left-justified in the D-specs; take advantage of this feature to help document your code:
D ErrMsgDSDS DS
D ErrPrefix 3
D ErrMsgID 4
D ErrMajor 2 OVERLAY(ErrMsgID:1)
D ErrMinor 2 OVERLAY(ErrMsgID:3)

Use length notation instead of positional notation in data structure declarations.
D-specs let you code fields either with specific from and to positions or simply with the length of the field. To avoid confusion and to better document the field, use length notation consistently. For example, code
D RtnCode DS
D PackedNbr 15P 5
instead of
D RtnCode DS
D PackedNbr 1 8P 5

Use positional notation only when the actual position in a data structure is important.
For example, when coding the program status data structure, the file information data structure, or the return data structure from an API, you'd use positional notation if your program ignores certain positions leading up to a field or between fields. Using positional notation is preferable to using unnecessary "filler" variables with length notation:
D APIRtn DS
D PackedNbr 145 152P 5
In this example, to better document the variable, consider overlaying the positionally declared variable with another variable declared with length notation:
D APIRtn DS
D Pos145 145 152
D PackNbr 15P 5
OVERLAY(Pos145)

When defining overlapping fields, use the OVERLAY keyword instead of positional notation.
Keyword OVERLAY explicitly ties the declaration of a "child" variable to that of its "parent." Not only does OVERLAY document this relationship, but if the parent moves elsewhere within the program code, the child will follow.

If your program uses compile-time arrays, use the **CTDATA form to identify the compile-time data.
This form effectively documents the identity of the compile-time data, tying the data at the end of the program to the array declaration in the D-specs. The **CTDATA syntax also helps you avoid errors by eliminating the need to code compile-time data in the same order in which you declare multiple arrays.

Naming Conventions


Perhaps the most important aspect of programming style deals with the names you give to data items (e.g., variables, named constants) and routines.
When naming an item, be sure the name fully and accurately describes the item.
The name should be unambiguous, easy to read, and obvious. Although you should exploit RPG IV's allowance for long names, don't make your names too long to be useful. Name lengths of 10 to 14 characters are usually sufficient, and longer names may not be practical in many specifications. When naming a data item, describe the item; when naming a subroutine or procedure, use a verb/object syntax (similar to a CL command) to describe the process. Maintain a dictionary of names, verbs, and objects, and use the dictionary to standardize your naming conventions.

When coding an RPG symbolic name, use mixed case to clarify the named item's meaning and use.
RPG IV lets you type your source code in upper- and lowercase characters. Use this feature to clarify named data. For RPG-reserved words and operations, use all uppercase characters.

Avoid using special characters (e.g., @, #, $) when naming items.
Although RPG IV allows an underscore (_) within a name, you can easily avoid using this "noise" character if you use mixed case intelligently.
Indicators


Historically, indicators have been an identifying characteristic of the RPG syntax, but with RPG IV they are fast becoming relics of an earlier era. To be sure, some operations still require indicators, and indicators remain the only practical means of communicating conditions to DDS-defined displays. But reducing a program's use of indicators may well be the single most important thing you can do to improve the program's readability.

Use indicators as sparingly as possible; go out of your way to eliminate them.
In general, the only indicators present in a program should be resulting indicators for opcodes that absolutely require them (e.g., CHAIN before V4R2) or indicators used to communicate conditions such as display attributes to DDS-defined display files.

Whenever possible, use built-in functions (BIFs) instead of indicators.
As of V4R2, you can indicate file exception conditions with error- handling BIFs (e.g., %EOF, %ERROR, %FOUND) and an E operation extender to avoid using indicators.

If you must use indicators, name them.
V4R2 supports a Boolean data type (N) that serves the same purpose as an indicator. You can use the INDDS keyword with a display-file specification to associate a data structure with the indicators for a display or printer file; you can then assign meaningful names to the indicators.

Use an array-overlay technique to name indicators before V4R2.
Using RPG IV's pointer support, you can overlay the *IN internal indicator array with a data structure. Then you can specify meaningful subfield names for the indicators. This technique lessens your program's dependence on numeric indicators. For example:
D IndicatorPtr * INZ(%ADDR(*IN))
D DS BASED(IndicatorPtr)
D F03Key 3 3
D F05Key 5 5
D CustNotFnd 50 50
D SflClr 91 91
D SflDsp 92 92
D SflDspCtl 93 93
C IF F03Key = *ON
C EVAL *INLR = *ON
C RETURN
C ENDIF
Use the EVAL opcode with *Inxx and *ON or *OFF to set the state of indicators.
Do not use SETON or SETOFF, and never use MOVEA to manipulate multiple indicators at once.

Use indicators only in close proximity to the point where your program sets their condition.
For example, it's bad practice to have indicator 71 detect end- of-file in a READ operation and not reference *IN71 until several pages later. If it's not possible to keep the related actions (setting and testing the indicator) together, move the indicator value to a meaningful variable instead.

Don't use conditioning indicators - ever.
If a program must conditionally execute or avoid a block of source, explicitly code the condition with a structured comparison opcode, such as IF. If you're working with old S/36 code, get rid of the blocks of conditioning indicators in the source.

4.6 Include a description of any indicators you use.
It's especially important to document indicators whose purpose isn't obvious by reading the program, such as indicators used to communicate with display or printer files or the U1-U8 external indicators, if you must use them.

Structured Programming Techniques


Give those who follow you a fighting chance to understand how your program works by implementing structured programming techniques at all times.

Don't use GOTO, CABxx, or COMP.
Instead, substitute a structured alternative, such as nested IF statements, or status variables to skip code or to direct a program to a specific location. To compare two values, use the structured opcodes IF, ELSE, and ENDIF. To perform loops, use DO, DOU, and DOW with ENDDO. Never code your loops by comparing and branching with COMP (or even IF) and GOTO. Employ ITER to repeat a loop iteration, and use LEAVE for premature exits from loops.

Don't use obsolete IFxx, DOUxx, DOWxx, or WHxx opcodes.
The newer forms of these opcodes - IF, DOU, DOW, and WHEN - support free-format expressions, making those alternatives more readable. In general, if an opcode offers a free-format alternative, use it.

Perform multipath comparisons with SELECT/WHEN/OTHER/ENDSL.
Deeply nested IFxx/ELSE/ENDIF code blocks are hard to read and result in an unwieldy accumulation of ENDIFs at the end of the group. Don't use the obsolete CASxx opcode; instead, use the more versatile SELECT/WHEN/OTHER/ENDSL construction.

Always qualify END opcodes.
Use ENDIF, ENDDO, ENDSL, or ENDCS as applicable. This practice can be a great help in deciphering complex blocks of source.

5.5 Avoid programming tricks and hidden code.
Such maneuvers aren't so clever to someone who doesn't know the trick. If you think you must add comments to explain how a block of code works, consider rewriting the code to clarify its purpose. Use of the obscure "bit-twiddling" opcodes (BITON, BITOFF, MxxZO, TESTB, TESTZ) may be a sign that your source needs updating.

Modular Programming Techniques


The RPG IV syntax, along with the AS/400's Integrated Language Environment (ILE), encourages a modular approach to application programming. Modularity offers a way to organize an application, facilitate program maintenance, hide complex logic, and efficiently reuse code wherever it applies.

Use RPG IV's prototyping capabilities to define parameters and procedure interfaces.
Prototypes (PR definitions) offer many advantages when you're passing data between modules and programs. For example, they avoid runtime errors by giving the compiler the ability to check the data type and number of parameters. Prototypes also let you code literals and expressions as parameters, declare parameter lists (even the *ENTRY PLIST) in the D-specs, and pass parameters by value and by read-only reference, as well as by reference.

Store prototypes in /COPY members.
For each module, code a /COPY member containing the procedure prototype for each exported procedure in that module. Then, include a reference to that /COPY module in each module that refers to the procedures in the called module. This practice saves you from typing the prototypes each time you need them and reduces errors.

Include constant declarations for a module in the same /COPY member as the prototypes for that module.
If you then reference the /COPY member in any module that refers to the called module, you've effectively "globalized" the declaration of those constants.

Use IMPORT and EXPORT only for global data items.
The IMPORT and EXPORT keywords let you share data among the procedures in a program without explicitly passing the data as parameters - in other words, they provide a "hidden interface" between procedures. Limit use of these keywords to data items that are truly global in the program - usually values that are set once and then never changed.

Character String Manipulation


IBM has greatly enhanced RPG IV's ability to easily manipulate character strings. Many of the tricks you had to use with earlier versions of RPG are now obsolete. Modernize your source by exploiting these new features.

Use a named constant to declare a string constant instead of storing it in an array or table.
Declaring a string (such as a CL command string) as a named constant lets you refer to it directly instead of forcing you to refer to the string through its array name and index. Use a named constant to declare any value that you don't expect to change during program execution.

Avoid using arrays and data structures to manipulate character strings and text.
Use the new string manipulation opcodes and/or built-in functions instead.

Use EVAL's free-format assignment expressions whenever possible for string manipulation.
When used with character strings, EVAL is usually equivalent to a MOVEL(P) opcode. Use MOVE and MOVEL only when you don't want the result to be padded with blanks.

Avoid Obsolescence


RPG is an old language. After 30 years, many of its original, obsolete features are still available. Don't use them.
Don't sequence program line numbers in columns 1-5.
Chances are you'll never again drop that deck of punched cards, so the program sequence area is unnecessary. In RPG IV, the columns are commentary only. You may use them to identify changed lines in a program or structured indentation levels, but be aware that these columns may be subject to the same hazards as right-hand comments (see style guideline 1.5).

Avoid program-described files.
Instead, use externally defined files whenever possible.

If an opcode offers a free-format syntax, use it instead of the fixed- format version.
Opcodes to avoid include CABxx, CASxx, CAT, DOUxx, DOWxx, IFxx, and WHxx.

If a BIF offers the same function as an opcode, use the BIF instead of the opcode.
With some opcodes, you can substitute a built-in function for the opcode and use the function within an expression. At V4R1, the SCAN and SUBST opcodes have virtually equivalent built-in functions, %SCAN and %SUBST. In addition, you can usually substitute the concatenation operator (+) in combination with the %TRIMx BIFs in place of the CAT opcode. The free-format versions are preferable if they offer the same functionality as the opcodes.

Shun obsolete opcodes.
In addition to the opcodes mentioned earlier (style guidelines 5.2 and 5.3), some opcodes are no longer supported or have better alternatives.

CALL, CALLB.
The prototyped calls (CALLP or a function call) are just as efficient as CALL and CALLB and offer the advantages of prototyping and parameter passing by value. Neither CALL nor CALLB can accept a return value from a procedure.

DEBUG.

With OS/400's advanced debugging facilities, this opcode is no longer supported.
You should use display file I/O to display information or to acquire input.

FREE.
This opcode is no longer supported.

PARM, PLIST.
If you use prototyped calls, these opcodes are no longer necessary.

Miscellaneous Guidelines


Here's an assortment of other style guidelines that can help you improve your RPG IV code.

In all specifications that support keywords, observe a one-keyword-per-line limit.
Instead of spreading multiple keywords and values across the entire specification, your program will be easier to read and let you more easily add or delete specifications if you limit each line to one keyword, or at least to closely related keywords (e.g., DATFMT and TIMFMT).

Begin all H-spec keywords in column 8, leaving column 7 blank.
Separating the keyword from the required H in column 6 improves readability.

Relegate mysterious code to a well-documented, well-named procedure.
Despite your best efforts, on extremely rare occasions you simply will not be able to make the meaning of a chunk of code clear without extensive comments. By separating such heavily documented, well-tested code into a procedure, you'll save future maintenance programmers the trouble of deciphering and dealing with the code unnecessarily.
Final Advice


Sometimes good style and efficient runtime performance don't mix. Wherever you face a conflict between the two, choose good style. Hard-to-read programs are hard to debug, hard to maintain, and hard to get right. Program correctness must always win out over speed. Keep in mind these admonitions from Brian Kernighan and P.J. Plauger's The Elements of Programming Style:
Make it right before you make it faster.
Keep it right when you make it faster.
Make it clear before you make it faster.
Don't sacrifice clarity for small gains in efficiency.

No comments:

Post a Comment