Tuesday, May 11, 2021

Updating Sequential files using COBOL

There is no doubt that mainframes running COBOL power the majority of the world's business transactions. Their users include financial institutions, hospitals, governments and logistics firms.

The very first site that I worked for is (even now) a global leader in the market for business information. They collect, store and process information about businesses to generate credit scores and business information reports. The scores assess a business, and a bank, say, can use the information in the report when deciding whether to offer that business a loan.

Master file:

    I was part of the application which generated the scores. We stored the scores for ~80 million businesses and we didn't maintain a database. Rather, we used a sequential Master file, which grew whenever a score was generated for a new business.

In addition to that, we had to update the scores of the existing businesses on a daily basis, as a business is prone to changes (there might be a change of CEO; the business might go Out of Business or Bankrupt; the business might win a Suit; its trade payment behaviour might change, and so on).

The Master file had a key field (a unique number assigned for each business) which uniquely identified each record. 
All the records were in sequence by the key field. 
The Master file's width (LRECL) was large enough to accommodate all the information collected about the business.
There were coded fields (e.g., codes used for Bankruptcy status, Out of business status etc.) to save space.



Transaction file:

    Daily changes to the businesses were stored in a file referred to as the transaction file. The transaction file held all the transactions that had occurred since the previous update and were to be posted to the Master file. The transaction file also had a key field (the same key as that of the Master file) and all the records in the transaction file were in sequence by the key field.


Updating a Master file:

    The process of making the Master file current is referred to as updating. The Master file is updated via sequential processing by reading in the Master file along with the transaction file and creating a new master file. At the end of the update process, there will be an old master and a new master; should something happen to the new master file, it can be recreated from the old. Refer to the following picture for better clarity.



The Old Master file (OLD-MASTER) contains master information that was complete and current as of the previous update cycle. The transaction file (TRANS-FILE) contains the transactions or changes that have occurred since then. These changes must be incorporated into the master file to make it current. As a result, the New Master file (NEW-MASTER) will include all the OLD-MASTER data plus the changes stored on the TRANS-FILE since the last update.

As all the records are in sequence by the key field, we compare the key field in the Old Master file to the same key field in the transaction file to determine if the master record is to be updated; this comparison requires both the files to be in sequence by the key field.  
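Since the whole update depends on that ordering, it can be worth checking it before the update runs. Here is a minimal Python sketch of such a check. It is only an illustration: the function name and the sample business numbers are made up, and it assumes each file carries at most one record per business number.

```python
def is_in_key_sequence(keys):
    """Return True when each key is strictly greater than the previous one."""
    return all(prev < cur for prev, cur in zip(keys, keys[1:]))


# Hypothetical business numbers, for illustration only.
print(is_in_key_sequence(["000000001", "000000002", "000000004"]))  # True
print(is_in_key_sequence(["000000002", "000000001", "000000004"]))  # False
```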

Let's take a look at the format of the two input files:

OLD-MASTER 📂 

(in sequence by M-BUSINESS-NO)

COLS      FIELD
1-9       M-BUSINESS-NO
10-39     M-BUSINESS-NAME
40-42     M-SCORE
43-100    M-FILLER


TRANS-FILE 📂

(in sequence by T-BUSINESS-NO)

COLS      FIELD
1-9       T-BUSINESS-NO
10-39     T-BUSINESS-NAME
40-42     T-SCORE
43-100    T-FILLER
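To make the column positions concrete, here is a small Python sketch that slices one 100-character record into the four fields listed above. It is not part of the original job: the `BusinessRecord` and `parse_record` names and the sample record are hypothetical, and it assumes the record has already been converted to ordinary text.

```python
from typing import NamedTuple


class BusinessRecord(NamedTuple):
    business_no: str
    business_name: str
    score: int
    filler: str


def parse_record(line: str) -> BusinessRecord:
    """Split one 100-character record using the 1-based column layout."""
    if len(line) != 100:
        raise ValueError(f"expected 100 characters, got {len(line)}")
    return BusinessRecord(
        business_no=line[0:9],              # columns 1-9
        business_name=line[9:39].rstrip(),  # columns 10-39
        score=int(line[39:42]),             # columns 40-42
        filler=line[42:100],                # columns 43-100
    )


# Hypothetical record, for illustration only.
sample = "000000001" + "ACME LTD".ljust(30) + "075" + " " * 58
record = parse_record(sample)
print(record.business_no, record.business_name, record.score)
```

Column N in the layout is index N-1 in Python, which is why columns 1-9 become the slice `[0:9]`.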


How are input transaction and Master records processed?

Once all the files are opened, a record is read from both the Old Master file and the transaction file. As the files are already in sequence by their respective key fields, M-BUSINESS-NO and T-BUSINESS-NO are compared to determine the next set of actions. The comparison has three possible outcomes:

IF T-BUSINESS-NO = M-BUSINESS-NO

If the business numbers are equal, this means that a transaction record exists with the same business number as that on the Master file. When this condition is met, the transaction data is posted to the master record. This means, the record which goes into the New Master file will contain the updated score and other fields from the transaction file.

Once the record is written, the next record is read from both the Old Master file and Transaction file. 

IF T-BUSINESS-NO > M-BUSINESS-NO 

If T-BUSINESS-NO > M-BUSINESS-NO, the record in the Master file has a business number lower than the one on the transaction file. Since both the files are in sequence by the business number, this condition means that a master record exists for which there is no corresponding transaction record. In other words, the record read from the master file hasn't gone through any changes during the current update cycle and should be written as it is onto the New Master file.

Once the record is written to the New Master file, the next record is read only from the Old Master file. We do not read another record from the Transaction file as we haven't yet processed the transaction record that caused T-BUSINESS-NO to compare greater than M-BUSINESS-NO.

IF T-BUSINESS-NO < M-BUSINESS-NO

Since both the files are in sequence by business number, this condition would mean that a transaction record exists for which there is no corresponding record in the Master file. This could mean that the scores are generated for a new business (voila! 😃). In this instance, a new master record is created entirely from the transaction file and is written onto the New Master file. 

Once written, the next record is read only from the Transaction file. We do not read another record from the Old Master file since we haven't yet processed the Master record that compared greater than T-BUSINESS-NO.
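The three cases can be sketched in a few lines of Python. This is only an illustration of the matching logic, not the COBOL program itself: the tuple layout, the `update_master` name and the sample records are made up, and it assumes that a matching transaction posts only the new score onto the master record.

```python
HIGH_VALUES = "\xff" * 9  # end-of-file sentinel; sorts after any 9-digit key


def update_master(old_master, transactions):
    """Merge two key-ordered lists of (business_no, name, score) tuples."""
    eof = (HIGH_VALUES, "", 0)
    master_iter = iter(old_master)
    trans_iter = iter(transactions)
    m = next(master_iter, eof)
    t = next(trans_iter, eof)
    new_master = []
    while m[0] != HIGH_VALUES or t[0] != HIGH_VALUES:
        if t[0] == m[0]:
            # Regular update: keep the master record, post the new score.
            new_master.append((m[0], m[1], t[2]))
            m = next(master_iter, eof)
            t = next(trans_iter, eof)
        elif t[0] < m[0]:
            # New business: build the master record from the transaction.
            new_master.append(t)
            t = next(trans_iter, eof)
        else:
            # No transaction for this master record: copy it unchanged.
            new_master.append(m)
            m = next(master_iter, eof)
    return new_master


# Hypothetical data, for illustration only.
old = [("000000001", "ALPHA", 50), ("000000002", "BETA", 60),
       ("000000003", "GAMMA", 70)]
trans = [("000000002", "BETA", 65), ("000000004", "DELTA", 80)]
for rec in update_master(old, trans):
    print(rec)
```

With this sample data, business 000000002 gets its new score, 000000001 and 000000003 are copied unchanged, and 000000004 is added as a new record.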

The following example illustrates the update procedure along with the corresponding action to be taken:



A sample update program is shown below: (Language - COBOL)

  ID DIVISION.                     
  PROGRAM-ID. CBL4.                  
  AUTHOR. SRINIVASAN.                 
 *                           
  ENVIRONMENT DIVISION.                
  INPUT-OUTPUT SECTION.                
  FILE-CONTROL.                    
    SELECT OLD-MASTER ASSIGN TO OLDMAST.       
    SELECT NEW-MASTER ASSIGN TO NEWMAST.       
    SELECT TRANS-FILE ASSIGN TO TRANS.        
 *                           
  DATA DIVISION.                    
  FILE SECTION.                    
  FD OLD-MASTER                    
    RECORDING MODE IS F               
    RECORD CONTAINS 100.               
  01 OLD-MASTER-REC.                  
   05 M-BUSINESS-NO        PIC X(9).     
   05 M-BUSINESS-NAME      PIC X(30).    
   05 M-SCORE              PIC 9(3).     
   05 M-FILLER             PIC X(58).    
  FD TRANS-FILE                    
    RECORDING MODE IS F               
    RECORD CONTAINS 100.               
  01 TRANS-REC.                    
   05 T-BUSINESS-NO        PIC X(9).     
   05 T-BUSINESS-NAME      PIC X(30).    
   05 T-SCORE              PIC 9(3).     
   05 T-FILLER             PIC X(58).    
  FD NEW-MASTER                    
    RECORDING MODE IS F               
    RECORD CONTAINS 100.               
  01 NEW-MASTER-REC.                  
   05 N-BUSINESS-NO        PIC X(9).     
   05 N-BUSINESS-NAME      PIC X(30).  
   05 N-SCORE              PIC 9(3).   
   05 N-FILLER             PIC X(58).  
 *                         
  PROCEDURE DIVISION.               
  100-MAIN-MODULE.                 
      PERFORM 800-INITIALIZATION-RTN        
      PERFORM 600-READ-MASTER           
      PERFORM 700-READ-TRANS            
      PERFORM 200-COMPARE-RTN           
        UNTIL M-BUSINESS-NO = HIGH-VALUES    
          AND T-BUSINESS-NO = HIGH-VALUES    
      PERFORM 900-CLOSE-FILES-RTN         
      STOP RUN.                  
 *                         
  200-COMPARE-RTN.                 
      EVALUATE TRUE                
      WHEN T-BUSINESS-NO = M-BUSINESS-NO      
           PERFORM 300-REGULAR-UPDATE       
      WHEN T-BUSINESS-NO < M-BUSINESS-NO      
           PERFORM 400-NEW-ACCOUNT         
      WHEN OTHER                  
           PERFORM 500-NO-UPDATE          
      END-EVALUATE.                
 *                         
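  300-REGULAR-UPDATE.
 *    Carry the old record forward, then post the score from the
 *    transaction onto it before writing the new master record.
      MOVE OLD-MASTER-REC TO NEW-MASTER-REC
      MOVE T-SCORE TO N-SCORE
      WRITE NEW-MASTER-REC
      PERFORM 600-READ-MASTER
      PERFORM 700-READ-TRANS.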
 *                         
  400-NEW-ACCOUNT.                 
      MOVE SPACES TO NEW-MASTER-REC        
      MOVE T-BUSINESS-NO TO N-BUSINESS-NO     
      MOVE T-BUSINESS-NAME TO N-BUSINESS-NAME   
      MOVE T-SCORE TO N-SCORE            
      MOVE T-FILLER TO N-FILLER           
      WRITE NEW-MASTER-REC              
      PERFORM 700-READ-TRANS.            
 *                          
  500-NO-UPDATE.                   
      WRITE NEW-MASTER-REC FROM OLD-MASTER-REC    
      PERFORM 600-READ-MASTER.            
 *                          
  600-READ-MASTER.                  
      READ OLD-MASTER                
      AT END MOVE HIGH-VALUES TO M-BUSINESS-NO    
      END-READ.                   
 *                          
  700-READ-TRANS.                  
      READ TRANS-FILE                
      AT END MOVE HIGH-VALUES TO T-BUSINESS-NO    
      END-READ.                   
 *                          
  800-INITIALIZATION-RTN.              
      OPEN INPUT OLD-MASTER             
                 TRANS-FILE             
          OUTPUT NEW-MASTER.             
 *                          
  900-CLOSE-FILES-RTN.                
      CLOSE OLD-MASTER                
            TRANS-FILE                
            NEW-MASTER.               
 *                          
Two files (Old Master file and Transaction file) are passed as input to the COBOL program. The program creates the New Master file as output. Contents of the files are shown below:

Old Master file:
Contents of Old Master file.


Transaction file:
Contents of Transaction file


JCL used to compile and run the load module:
The first step of the JCL compiles the COBOL program. If the compilation is successful, the second step runs the load module.


After submitting the JCL, the following output file is created. 

New Master file:
Contents of New Master file.

Note the new record with business number 000000004 added to the New Master file. Also, the scores of the existing businesses are updated.

Use of HIGH-VALUES for End of file conditions:

With 2 input files, it's very unlikely that both the files will reach their AT END conditions at the same time. It is quite likely that the transaction file will run out of records before the Old Master file. In such cases, the remaining records from the Old Master file must be written to the New Master file.

The COBOL reserved word HIGH-VALUES is moved to the business number field when the Old Master file or the Transaction file reaches its end.

HIGH-VALUES refers to the largest value in the system's collating sequence: a character with all bits on in a single storage position. In EBCDIC, all bits on (X'FF') is a nonstandard, nonprintable character that sorts higher than any other character.

When the Transaction file reaches the end, HIGH-VALUES is moved to T-BUSINESS-NO. This ensures that every subsequent comparison against a real M-BUSINESS-NO results in the "greater than" condition, i.e., there is a record in the Master file with a business number less than the business number on the transaction file. Each remaining master record is therefore treated as unchanged during the current update cycle and is written as it is onto the New Master file. The same logic applies in reverse when the Old Master file ends first: the remaining transactions compare "less than" and are added as new records.

HIGH-VALUES is a figurative constant that may be used only with fields that are defined as alphanumeric. If numeric fields are used, then moving all 9s (999999999) to the key field will compare at least as high as any other number. Beware: if a business number of 999999999 is a possible entry, then moving all 9s at the end-of-file condition could produce an error.
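The difference between the two kinds of sentinel can be shown with a short Python sketch. It is an analogy rather than COBOL behaviour: Python compares strings by code point, not by the EBCDIC collating sequence, so `"\xff" * 9` merely stands in for HIGH-VALUES here, and the function name and sample keys are made up.

```python
ALPHA_SENTINEL = "\xff" * 9     # stand-in for HIGH-VALUES in an alphanumeric key
NUMERIC_SENTINEL = 999_999_999  # all 9s in a numeric key


def sentinel_is_safe(keys, sentinel):
    """A sentinel is safe only if it is strictly greater than every real key."""
    return all(key < sentinel for key in keys)


# Hypothetical keys, for illustration only.
print(sentinel_is_safe(["000000001", "999999999"], ALPHA_SENTINEL))  # True
print(sentinel_is_safe([1, 999_999_999], NUMERIC_SENTINEL))          # False
```

The second call fails because a real key equal to the sentinel cannot be told apart from the end-of-file marker, which is the risk described above.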

We've hit the end-of-file condition for this blog post 😉

    In this post, we learnt about the procedure used for updating sequential files in COBOL. This procedure is also referred to as 'file-matching logic'. Hope it was useful. 

    In the next post, I'll try to implement the same logic in Python. Thanks for reading! Should you have any queries/suggestions, please post them in the Comments section below 👍.


References used for this post:
Structured COBOL Programming - 8th Edition - Stern/Stern.



Wednesday, May 5, 2021

Using process statements in SuperCE utility

One of the items that I always strike off ✅ my checklist whenever I'm assigned the task of modifying existing code is Source Code Comparison. It allows me to highlight the differences between versions of the code. It also acts as proof for the reviewer that only the intended parts of the code were modified.

Although CA Endevor lets us use the Changes (C) option to look at the actual lines we've changed, I rely upon the SuperCE utility (option 3.13) to compare the modified code with the existing version of the code in the Production environment.

Welcome to my blog! 😀 In this blog post, we will look at the SuperCE (option 3.13) ISPF option, which is used to compare the content of two datasets, and at the use of process statements, which is similar to the use of control statements in IBM's DFSORT utility.




Intro

SuperC (I guess the suffix 'C' after Super stands for Compare) is the standard option to compare two datasets of unlimited size and record length. SuperCE is the extended version of the standard SuperC Utility and it offers more flexibility, such as:

  • Comparing the datasets in line, word or byte level, 
  • Supplying process statements for specific compare requirements 
  • Various listing types and so on. 

How to access SuperCE Utility and use it?

To access the SuperCE Utility from ISPF Primary Option Menu, type 3 (Utilities) and press Enter.


ISPF Primary Option Menu.


From the Utility Selection Panel, type 13 (SuperCE) and press Enter.

Selecting SuperCE from Utility Selection Panel.


Voila! 👏
SuperCE Utility Panel.

Alternatively, you can type =3.13 from ISPF Primary Option Menu (or command line for that matter) and hit Enter to directly get into SuperCE Utility panel.
 

How to use SuperCE Utility?

Now that we're inside the SuperCE Utility panel, let's use it. 
The true method of knowledge is experiment. 
 - William Blake
To use the SuperCE utility, we need two datasets. Each can be a sequential dataset, a PDS or a member inside a PDS. ❗ SuperC and SuperCE don't support tape datasets.

I've got 2 PDS members with a simple COBOL program in each of them. For better understanding, I've named these members NEW and OLD because the contents of the NEW member are an updated version of the contents of the OLD member.

The NEW member. This COBOL program accepts a name from the user and displays the name with a greeting.


The OLD member. As you probably know, this COBOL program simply displays a very famous message to the user. 


The next step is to input these datasets in the SuperCE Utility panel and do the comparison. The New DS Name field should be provided with the updated version of the dataset that you want to compare and the Old DS Name field should be provided with the previous version of the dataset. 

Using the SuperCE Utility panel.

Whenever you access the SuperCE Utility panel, it provides default settings for the Compare Type, Listing Type, Listing DSN, and Browse option.

SuperCE Utility works the best for you with the following settings,
  • Compare Type - Line (Compares the dataset for line differences)
  • Listing Type - Delta (SuperCE provides a listing after the comparison. This listing shows some awesome stats. Delta option lists the differences between the source data sets, followed by the general summary)
  • Listing DSN - This is where the listing output will be stored. SuperCE allocates a default DSN if you leave this field blank. If you want to store the results of the comparison (I do, as I used to pass this dataset on to my code reviewer), you may provide your own DSN.
  • Display Output - Yes (This option tells ISPF that you want the output listing to be displayed. If you choose the option No, SuperCE will not show the listing but it shows the result of the comparison (Differences found or No differences found) at the top right corner of the panel). 
  • Output Mode - View or Browse 
  • Execution Mode - Foreground is the default. 
For more details about the SuperCE Panel Fields, click 👉 here.  

Let's hit Enter to let SuperCE perform the comparison. The listing output after the comparison is shown below.
 
Listing output for Line Compare. 

In the Listing Output Section (Line #4 thru 21), the source lines are shown. 

Left side of each line is either marked with I (Insert) or D (Delete). 

The first source line at line #9, 000200 PROGRAM-ID. NEW.  , is marked with I (Insert) i.e., the listing tells that this line was inserted in the New DSN and wasn't found in the Old DSN. 

The next source line at line #10 is marked with D (Delete) i.e., the listing tells that this line is present in Old DSN but not in the New DSN. So, it must have been deleted in the updated version of the code. 

The Line Compare Summary and Statistics section at the bottom shows the overall summary of the comparison. 
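If you'd like to play with the idea of I and D markers outside ISPF, here is a rough Python sketch built on the standard `difflib` module. It is not how SuperCE works internally and its output will not match a real listing line for line: the `delta_listing` name is made up, and the OLD source line in the sample is my assumption.

```python
import difflib


def delta_listing(new_lines, old_lines):
    """Return (marker, line) pairs: 'I' = only in new, 'D' = only in old."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    listing = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            listing.extend(("I", line) for line in new_lines[j1:j2])
        if tag in ("replace", "delete"):
            listing.extend(("D", line) for line in old_lines[i1:i2])
    return listing


# Sample lines; the OLD line is assumed for illustration.
new = ["000100 IDENTIFICATION DIVISION.", "000200 PROGRAM-ID. NEW."]
old = ["000100 IDENTIFICATION DIVISION.", "000200 PROGRAM-ID. OLD."]
for marker, line in delta_listing(new, old):
    print(marker, "-", line)
```

Matching lines are left out, just as the Delta listing type lists only the differences.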

How to use process statements to perform diverse data comparisons?

As you would've noticed in the listing output, the first 6 bytes of the COBOL code (the sequence number columns) were also included in the comparison by SuperCE.

To compare only the data residing in columns 7 thru 72 of both the datasets, you should supply process statements for this requirement.

The process statements panel can be accessed by typing E in the command line of SuperCE Utility panel, or by using Options action bar choice and choosing Option 1 - Edit Statements.

Accessing Process Statements panel.


In the following picture, some examples of the statements that can be used are shown in the bottom half of the screen. The actual statements required for your comparison should be typed in the EDIT window shown in the first half of the screen. 

Process Statements panel.


CMPCOLM process statement should be used to compare using a column range.

Inputting Process Statements. 

We can exit the screen now by pressing F3. The message 'Statements DS saved' is displayed at the top right corner of the SuperCE Utility panel.

Statements DS Saved.


The compare statements will be stored in the dataset provided in the Statements DSN field of the SuperCE Utility panel. This field can also be left blank, in which case the system creates a dataset for you to store the process statements.

On hitting Enter, the compare request will be invoked with the process options.
 
Listing Output

The Line Compare Summary shows that there are 4 line matches and 6 differences. At the bottom of the screen, the criteria used for this compare task is specified. 

There are many flavours of process statements that can be invoked depending on what you need to compare. Some of them are listed below. 

Example 1:



Notice that the process statement name, CMPCOLM, ends with a suffix of N or O, indicating that it refers to the New DSN or the Old DSN respectively. What follows the statement is the column range within the referenced dataset.

With these statements, we tell SuperCE that we want to compare the data residing in columns 5 to 30 in the New DSN with data in columns 1 to 25 in the Old DSN. 
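The effect of a column range is easy to mimic in Python. The sketch below is only an analogy for what the comparison sees, not SuperCE itself: the function names and sample lines are made up, and columns are 1-based and inclusive as on the panel.

```python
def pick_columns(line, start, end):
    """Return the text in columns start..end (1-based, inclusive)."""
    return line[start - 1:end]


def same_in_columns(new_line, old_line, new_cols, old_cols):
    """Compare two lines using a separate column range for each side."""
    return pick_columns(new_line, *new_cols) == pick_columns(old_line, *old_cols)


# Hypothetical lines: the sequence numbers differ, the code does not.
new_line = "000100 DISPLAY 'HI'."
old_line = "000900 DISPLAY 'HI'."
print(new_line == old_line)                                     # False
print(same_in_columns(new_line, old_line, (7, 72), (7, 72)))    # True

# Different ranges per side, as in Example 1 (5 to 30 vs 1 to 25).
shifted_new = "    " + "MOVE A TO B."
shifted_old = "MOVE A TO B."
print(same_in_columns(shifted_new, shifted_old, (5, 30), (1, 25)))  # True
```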

Example 2:
Suppose you want to exclude the comment lines in your COBOL code from the comparison.



The DPLINE (Do not process lines) process statement excludes from the comparison any line that can be recognized by a given character string.

DPLINE '*',7 looks for an asterisk ('*') in column 7 and excludes every such line from the comparison.
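In Python terms, the effect is roughly that of filtering the lines before they are compared. The sketch below is an approximation under that assumption, with a made-up function name and sample lines; it is not a description of SuperCE internals.

```python
def drop_marked_lines(lines, marker="*", column=7):
    """Drop lines that have `marker` starting at the given 1-based column."""
    start = column - 1
    return [line for line in lines if not line.startswith(marker, start)]


# Hypothetical COBOL lines: the first one is a comment (asterisk in column 7).
source = [
    "000100* THIS IS A COMMENT",
    "000200 DISPLAY 'HI'.",
]
print(drop_marked_lines(source))
```

Lines shorter than the given column simply never match, so they are kept.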


Example 3:
Suppose you want to compare only specific rows in each dataset.


The NFOCUS and OFOCUS process statements can be used to specify the rows to be used for the comparison. In this case, rows 1 thru 10 will be used from the New DSN while rows 11 thru 21 will be used from the Old DSN. 
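Row selection can be pictured as slicing each dataset before the comparison starts. This tiny Python sketch only illustrates that idea, with a made-up function name and placeholder rows; it assumes row numbers are 1-based and inclusive.

```python
def focus_rows(lines, first, last):
    """Keep rows first..last (1-based, inclusive)."""
    return lines[first - 1:last]


# Placeholder rows, for illustration only.
rows = [f"row {n}" for n in range(1, 22)]
new_part = focus_rows(rows, 1, 10)   # rows 1 thru 10 of the New DSN
old_part = focus_rows(rows, 11, 21)  # rows 11 thru 21 of the Old DSN
print(len(new_part), len(old_part))
```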

More about Process Statements can be found 👉 here

Running SuperCE in batch mode

Sit back and relax. You can create a JCL from SuperCE Utility panel (with fewer hits on that Enter button) to run the comparison in batch mode. ISRSUPC is the program which is used for comparison.

After providing the datasets in the New and Old DSN fields, select Batch as the execution mode and press Enter. In the Submit Batch jobs panel, the Job statement info is provided at the bottom of the screen. I've chosen to Edit JCL before submit.

SuperC Utility - Submit Batch jobs panel.


Upon hitting Enter, the JCL is shown to the user.



If you are adding Process Statements, a SYSIN DD statement will be added to the JCL. 



Conclusion

Hope you got to see what the SuperCE utility can do. If SuperCE stands for Super Compare Extended, then the adjective Super is well deserved. Should you have any questions/suggestions, please leave them in the comments section below. Thx 👍


References: 
  • z/OS ISPF User's Guide Vol II
  • TSO/ISPF Curriculum z/OS v2.3 - Interskill Learning