Tuesday, May 11, 2021

Updating Sequential files using COBOL

There is no doubt that mainframes running COBOL power the majority of the world's business transactions. Typical users are financial institutions, hospitals, governments and logistics firms.

The very first site that I worked for is (even now) a global leader in the business information market. They collect, store and process information about businesses to generate credit scores and business information reports. The scores assess a business and help, say, a bank decide whether to offer that business a loan.

Master file:

    I was part of the application which generated the scores. We stored the scores for ~80 million businesses and we didn't maintain a database. Rather, we used a sequential Master file which grew whenever a score was generated for a new business.

In addition to that, we had to update the scores of the existing businesses on a daily basis, as a business is prone to changes (there might be a change in CEO; the business might go Out of Business or Bankrupt; the business might win a Suit; its trade payments might change; and so on).

The Master file had a key field (a unique number assigned for each business) which uniquely identified each record. 
All the records were in sequence by the key field. 
The Master file's width (LRECL) was large enough to accommodate all the information collected about the business.
There were coded fields (e.g., codes used for Bankruptcy status, Out of business status etc.) to save space.



Transaction file:

    Daily changes to the businesses were stored in a file referred to as the transaction file. The transaction file held all the transactions that had occurred since the previous update and were to be posted to the Master file. The transaction file also had a key field (the same key as that of the Master file) and all the records in the transaction file were in sequence by the key field.


Updating a Master file:

    The process of making the Master file current is referred to as updating. The Master file is updated via sequential processing by reading in the Master file along with the transaction file and creating a new master file. At the end of the update process, there will be an old master and a new master; should something happen to the new master file, it can be recreated from the old. Refer to the following picture for better clarity.


The Old Master file (OLD-MASTER) contains master information that was complete and current as of the previous update cycle. The transaction file (TRANS-FILE) contains the transactions or changes that have occurred since then. These changes must be incorporated into the master file to make it current. As a result, the New Master file (NEW-MASTER) will include all OLD-MASTER data plus the changes stored on the TRANS-FILE since the last update.

As all the records are in sequence by the key field, we compare the key field in the Old Master file to the same key field in the transaction file to determine if the master record is to be updated; this comparison requires both the files to be in sequence by the key field.  

Let's take a look at the format of the two input files:

OLD-MASTER 📂 

(in sequence by M-BUSINESS-NO)

COLS      FIELD
1-9       M-BUSINESS-NO
10-39     M-BUSINESS-NAME
40-42     M-SCORE
43-100    M-FILLER


TRANS-FILE 📂

(in sequence by T-BUSINESS-NO)

COLS      FIELD
1-9       T-BUSINESS-NO
10-39     T-BUSINESS-NAME
40-42     T-SCORE
43-100    T-FILLER


How are input transaction and Master records processed?

Once all the files are opened, a record is read from both the Old Master file and the transaction file. As the files are already in sequence by their respective key fields, M-BUSINESS-NO and T-BUSINESS-NO are compared to determine the next set of actions. The comparison has three possible outcomes:

IF T-BUSINESS-NO = M-BUSINESS-NO

If the business numbers are equal, this means that a transaction record exists with the same business number as that on the Master file. When this condition is met, the transaction data is posted to the master record. This means, the record which goes into the New Master file will contain the updated score and other fields from the transaction file.

Once the record is written, the next record is read from both the Old Master file and Transaction file. 

IF T-BUSINESS-NO > M-BUSINESS-NO 

If T-BUSINESS-NO > M-BUSINESS-NO, the master record has a business number lower than that of the current transaction record. Since both files are in sequence by business number, no transaction record can exist for this master record. The business has had no changes during the current update cycle, so the master record is written as it is onto the New Master file.

Once the record is written to the New Master file, the next record is read only from the Old Master file. We do not read another record from the Transaction file, because the transaction record that compared greater than M-BUSINESS-NO has not been processed yet.

IF T-BUSINESS-NO < M-BUSINESS-NO

Since both the files are in sequence by business number, this condition would mean that a transaction record exists for which there is no corresponding record in the Master file. This could mean that the scores are generated for a new business (voila! 😃). In this instance, a new master record is created entirely from the transaction file and is written onto the New Master file. 

Once written, the next record is read only from the Transaction file. We do not read another record from the Old Master file, because the master record that compared greater than T-BUSINESS-NO has not been processed yet.
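
The three cases can be traced with a small, self-contained sketch that keeps the keys in memory instead of reading files. Everything in it is an assumption made for illustration: the program name MATCHDMO, the table names, and the sample business numbers are hypothetical and are not the data used in this post. The last entry of each table stands in for the HIGH-VALUES end-of-file marker.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MATCHDMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical master keys, in sequence. The last entry
      * plays the role of the end-of-file marker.
       01  WS-MASTER-KEYS.
           05  FILLER          PIC X(9) VALUE '000000001'.
           05  FILLER          PIC X(9) VALUE '000000002'.
           05  FILLER          PIC X(9) VALUE '000000003'.
           05  FILLER          PIC X(9) VALUE HIGH-VALUES.
       01  WS-MASTER-TABLE REDEFINES WS-MASTER-KEYS.
           05  M-KEY           PIC X(9) OCCURS 4 TIMES.
      * Hypothetical transaction keys, in sequence.
       01  WS-TRANS-KEYS.
           05  FILLER          PIC X(9) VALUE '000000002'.
           05  FILLER          PIC X(9) VALUE '000000004'.
           05  FILLER          PIC X(9) VALUE HIGH-VALUES.
       01  WS-TRANS-TABLE REDEFINES WS-TRANS-KEYS.
           05  T-KEY           PIC X(9) OCCURS 3 TIMES.
      * Subscripts of the "current record" in each table.
       01  M-IDX               PIC 9 VALUE 1.
       01  T-IDX               PIC 9 VALUE 1.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM COMPARE-PARA
               UNTIL M-KEY (M-IDX) = HIGH-VALUES
                 AND T-KEY (T-IDX) = HIGH-VALUES
           STOP RUN.
       COMPARE-PARA.
           EVALUATE TRUE
      * Keys match: post the transaction, advance both sides.
               WHEN T-KEY (T-IDX) = M-KEY (M-IDX)
                   DISPLAY M-KEY (M-IDX) ' UPDATE'
                   ADD 1 TO M-IDX T-IDX
      * Transaction without a master: new business.
               WHEN T-KEY (T-IDX) < M-KEY (M-IDX)
                   DISPLAY T-KEY (T-IDX) ' ADD NEW'
                   ADD 1 TO T-IDX
      * Master without a transaction: copy as it is.
               WHEN OTHER
                   DISPLAY M-KEY (M-IDX) ' COPY UNCHANGED'
                   ADD 1 TO M-IDX
           END-EVALUATE.
```

With these sample keys, business 000000001 and 000000003 are copied unchanged, 000000002 is updated, and 000000004 is added as a new record after the master table has already reached its end marker.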

The following example illustrates the update procedure along with the corresponding action to be taken:



A sample update program is shown below: (Language - COBOL)

  ID DIVISION.                     
  PROGRAM-ID. CBL4.                  
  AUTHOR. SRINIVASAN.                 
 *                           
  ENVIRONMENT DIVISION.                
  INPUT-OUTPUT SECTION.                
  FILE-CONTROL.                    
    SELECT OLD-MASTER ASSIGN TO OLDMAST.       
    SELECT NEW-MASTER ASSIGN TO NEWMAST.       
    SELECT TRANS-FILE ASSIGN TO TRANS.        
 *                           
  DATA DIVISION.                    
  FILE SECTION.                    
  FD OLD-MASTER                    
    RECORDING MODE IS F               
    RECORD CONTAINS 100.               
  01 OLD-MASTER-REC.                  
   05 M-BUSINESS-NO        PIC X(9).     
   05 M-BUSINESS-NAME      PIC X(30).    
   05 M-SCORE              PIC 9(3).     
   05 M-FILLER             PIC X(58).    
  FD TRANS-FILE                    
    RECORDING MODE IS F               
    RECORD CONTAINS 100.               
  01 TRANS-REC.                    
   05 T-BUSINESS-NO        PIC X(9).     
   05 T-BUSINESS-NAME      PIC X(30).    
   05 T-SCORE              PIC 9(3).     
   05 T-FILLER             PIC X(58).    
  FD NEW-MASTER                    
    RECORDING MODE IS F               
    RECORD CONTAINS 100.               
  01 NEW-MASTER-REC.                  
   05 N-BUSINESS-NO        PIC X(9).     
   05 N-BUSINESS-NAME      PIC X(30).  
   05 N-SCORE              PIC 9(3).   
   05 N-FILLER             PIC X(58).  
 *                         
  PROCEDURE DIVISION.               
  100-MAIN-MODULE.                 
      PERFORM 800-INITIALIZATION-RTN        
      PERFORM 600-READ-MASTER           
      PERFORM 700-READ-TRANS            
      PERFORM 200-COMPARE-RTN           
        UNTIL M-BUSINESS-NO = HIGH-VALUES    
          AND T-BUSINESS-NO = HIGH-VALUES    
      PERFORM 900-CLOSE-FILES-RTN         
      STOP RUN.                  
 *                         
  200-COMPARE-RTN.                 
      EVALUATE TRUE                
      WHEN T-BUSINESS-NO = M-BUSINESS-NO      
           PERFORM 300-REGULAR-UPDATE       
      WHEN T-BUSINESS-NO < M-BUSINESS-NO      
           PERFORM 400-NEW-ACCOUNT         
      WHEN OTHER                  
           PERFORM 500-NO-UPDATE          
      END-EVALUATE.                
 *                         
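  300-REGULAR-UPDATE.
 * Start from the old master record, then post the score from
 * the transaction. Other changed fields would be moved the
 * same way.
      MOVE OLD-MASTER-REC TO NEW-MASTER-REC
      MOVE T-SCORE TO N-SCORE
      WRITE NEW-MASTER-REC
      PERFORM 600-READ-MASTER
      PERFORM 700-READ-TRANS.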
 *                         
  400-NEW-ACCOUNT.                 
      MOVE SPACES TO NEW-MASTER-REC        
      MOVE T-BUSINESS-NO TO N-BUSINESS-NO     
      MOVE T-BUSINESS-NAME TO N-BUSINESS-NAME   
      MOVE T-SCORE TO N-SCORE            
      MOVE T-FILLER TO N-FILLER           
      WRITE NEW-MASTER-REC              
      PERFORM 700-READ-TRANS.            
 *                          
  500-NO-UPDATE.                   
      WRITE NEW-MASTER-REC FROM OLD-MASTER-REC    
      PERFORM 600-READ-MASTER.            
 *                          
  600-READ-MASTER.                  
      READ OLD-MASTER                
      AT END MOVE HIGH-VALUES TO M-BUSINESS-NO    
      END-READ.                   
 *                          
  700-READ-TRANS.                  
      READ TRANS-FILE                
      AT END MOVE HIGH-VALUES TO T-BUSINESS-NO    
      END-READ.                   
 *                          
  800-INITIALIZATION-RTN.              
      OPEN INPUT OLD-MASTER             
                 TRANS-FILE             
          OUTPUT NEW-MASTER.             
 *                          
  900-CLOSE-FILES-RTN.                
      CLOSE OLD-MASTER                
            TRANS-FILE                
            NEW-MASTER.               
 *                          
Two files (the Old Master file and the Transaction file) are passed as input to the COBOL program. The program creates the New Master file as output. The contents of the files are shown below:

Old Master file:
Contents of Old Master file.


Transaction file:
Contents of Transaction file


JCL used to compile and run the load module:
The first step of the JCL compiles the COBOL program. If the compilation is successful, the second step runs to execute the load module.


After submitting the JCL, the following output file is created. 

New Master file:
Contents of New Master file.

Note the new record with business number as 000000004 added to the New Master file. Also, the scores of the existing businesses are updated. 

Use of HIGH-VALUES for End of file conditions:

With two input files, it is unlikely that both reach the AT END condition at the same time. If the transaction file runs out of records first, the remaining records from the Old Master file must still be written to the New Master file. If the Old Master file runs out first, the remaining transaction records belong to new businesses and must be added to the New Master file.

The COBOL figurative constant HIGH-VALUES is moved to the business number field when the Old Master file or the Transaction file has reached its end.

HIGH-VALUES refers to the largest value in the system's collating sequence. It is a character consisting of "all bits on" in a single storage position (X'FF'). In EBCDIC, all bits on is a nonstandard, nonprintable character that compares higher than any other character.

When the Transaction file reaches the end, HIGH-VALUES are moved to T-BUSINESS-NO. This ensures that the subsequent attempt to compare the T-BUSINESS-NO and M-BUSINESS-NO will always result in a "greater than" condition i.e., there is a record in the Master file with a business number less than the business number on the transaction file. This means the record read from the master file hasn't gone through any changes during the current update cycle and should be written as it is onto the New Master file.

HIGH-VALUES is a figurative constant that may be used only with fields that are defined as alphanumeric. If the key field is numeric, moving all 9s (999999999) to it at end-of-file gives a value that no other number can exceed. Beware: if 999999999 is a possible business number, that record becomes indistinguishable from the end-of-file marker and the update could produce incorrect results.
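
One possible way around the all-9s problem, which is not taken from the reference used for this post, is to compare a group item made of a one-byte end-of-file switch followed by the numeric business number. The sketch below is a minimal illustration under that assumption; the program name EOFKEY and the WS- field names are hypothetical. Because a group item is compared as an alphanumeric string, a switch of '1' sorts after every real key, including 999999999.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EOFKEY.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical compare keys: a one-byte end-of-file switch
      * followed by the numeric business number.
       01  WS-M-KEY.
           05  WS-M-EOF        PIC X    VALUE '0'.
           05  WS-M-NO         PIC 9(9) VALUE 999999999.
       01  WS-T-KEY.
           05  WS-T-EOF        PIC X    VALUE '0'.
           05  WS-T-NO         PIC 9(9) VALUE ZEROS.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Simulate AT END on the transaction file: only the switch
      * changes, so no business number is reserved as a marker.
           MOVE '1' TO WS-T-EOF
      * Group items compare as alphanumeric strings, left to right.
           IF WS-T-KEY > WS-M-KEY
               DISPLAY 'MASTER 999999999 IS STILL PROCESSED'
           ELSE
               DISPLAY 'MASTER 999999999 WOULD BE SKIPPED'
           END-IF
           STOP RUN.
```

Here the transaction key compares greater than the master key, so a master record numbered 999999999 would still be copied to the New Master file. In a real update program the business number would be moved into the WS- key after each successful READ, and the switch set in the AT END clause.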

We've hit the end-of-file condition for this blog post 😉

    In this post, we learnt about the procedure used for updating sequential files in COBOL. This procedure is also referred to as 'file-matching logic'. Hope it was useful. 

    In the next post, I'll try to implement the same logic in Python. Thanks for reading! Should you have any queries or suggestions, please post them in the Comments section below 👍.


References used for this post:
Structured COBOL Programming - 8th Edition - Stern/Stern.


