Friday, February 26, 2021

How to sort on the bits of a byte using IBM DFSORT?

Recently, I came across a DFSORT coding challenge titled "Odds & Evens".

The problem statement goes like this - Given a file with valid sequence numbers in columns 1 thru 6, sort the file so the corresponding output has all the even numbered records first, followed by all the odd numbered records.

I put on my thinking cap 🎩 for a while and came up with the following answer:

 ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--  
 ***************************** Top of Data ******************************  
 //Z01071A JOB 1,NOTIFY=&SYSUID                       
 //STEP01  EXEC PGM=SORT                           
 //SORTIN  DD *                               
 000001                                   
 000002                                   
 000003                                   
 000004                                   
 000005                                   
 000006                                   
 000007                                   
 000008                                   
 000009                                   
 000010                                   
 000011                                   
 000012                                   
 000013                                   
 000014                                   
 000015                                   
 000016                                   
 000017                                   
 000018                                   
 000019                                   
 000020                                   
 //SORTOUT DD SYSOUT=*                            
 //SYSOUT  DD SYSOUT=*                            
 //SYSIN   DD *                               
   INREC IFTHEN=(WHEN=GROUP,RECORDS=2,PUSH(10:SEQ=1))            
   SORT FIELDS=(10,1,CH,D,1,6,CH,A)                     
   OUTREC FIELDS=(1,6)                            
 /*                                     
I just formed a group of 2 records and PUSH'ed sequence numbers (of 1 byte) for each record of the group. As there are only 2 records in a group, the sequence number will be 1 for the first record and 2 for the second record. The sequence number will be restarted from 1 when a new group is started. 

Then, I used the sequence number field (at col 10) in the SORT statement to sort it in the descending order so that all the records with sequence number as 2 will be at the top.  A secondary sort was applied on the first 6 bytes. 
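If you think in Python, the three control statements can be mimicked roughly as shown below. This is only a sketch of the logic, not how DFSORT works internally, and the variable names (records, tagged, output) are made up for illustration.

```python
# Sequence numbers 000001..000020, as in the SORTIN data above.
records = [str(n).zfill(6) for n in range(1, 21)]

# INREC ... WHEN=GROUP,RECORDS=2,PUSH(10:SEQ=1):
# tag each record with its position (1 or 2) inside a group of 2 records.
tagged = [(index % 2 + 1, record) for index, record in enumerate(records)]

# SORT FIELDS=(10,1,CH,D,1,6,CH,A):
# descending on the tag, then ascending on the 6-byte sequence number.
tagged.sort(key=lambda pair: (-pair[0], pair[1]))

# OUTREC FIELDS=(1,6): keep only the sequence number.
output = [record for _, record in tagged]
print("\n".join(output))
```

The records tagged with 2 (the even ones) come out first, followed by the records tagged with 1.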

After submitting this job, I got the following output:
  COMMAND INPUT ===>                                            SCROLL ===> CSR   
 ********************************* TOP OF DATA **********************************  
 000002                                       
 000004                                       
 000006                                       
 000008                                       
 000010                                       
 000012                                       
 000014                                       
 000016                                       
 000018                                       
 000020                                       
 000001                                       
 000003                                       
 000005                                       
 000007                                       
 000009                                       
 000011                                       
 000013                                       
 000015                                       
 000017                                       
 000019                                       
 ******************************** BOTTOM OF DATA ********************************  
WHEN=GROUP is one amazing feature in DFSORT, thanks to Frank Yaeger from IBM DFSORT Development team, as he is one of the brains behind the invention of WHEN=GROUP.

We got the answer. Are we done here?

Nope, I'm just done with the Intro. 

The main reason for writing this blog post is that when I was looking at other answers, I stumbled upon a solution with a syntax that I had never seen before. It goes like this: 
SORT FIELDS=(6.7,0.1,BI,A),EQUALS
Most of us would use the SORT control statement to specify the control field based on which the sorting should take place. We provide,
  1. the position of the field within the record
  2. the length of the field (in bytes)
  3. the format of the data in control field
  4. the order in which field must be sorted (ascending or descending)
Let's take the first 2 items. The position of the field within the record is the byte position relative to the beginning of the record. The length of the field is usually expressed as a whole number of bytes. We deal with Bytes (and a pet lover has to deal with bites 🐢 sometimes).

Let's take a look under the hood 🔧


A byte consists of 2 nibbles and each nibble is 4 bits long. A bit is either 0 or 1.

IBM Mainframe uses the EBCDIC character encoding. Each character is represented by its 8-bit EBCDIC code. When we turn on the Hex mode, we can see a hex value for each byte. When the hex value of each byte is converted to binary, we get the corresponding bits. 

For example, 

SRINI becomes,
E2        D9        C9        D5        C9                  Hexadecimal
11100010  11011001  11001001  11010101  11001001    Binary
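The same conversion can be checked off the mainframe with a few lines of Python. This is a small sketch that assumes the cp037 EBCDIC code page shipped with Python; the letters and digits used in this post have the same code points in the common EBCDIC code pages.

```python
text = "SRINI"

# cp037 is one of the EBCDIC code pages that ships with Python.
ebcdic = text.encode("cp037")

# One hex value and one 8-bit binary string per byte.
hex_values = [format(byte, "02X") for byte in ebcdic]
bit_values = [format(byte, "08b") for byte in ebcdic]

print("  ".join(value.ljust(8) for value in hex_values))
print("  ".join(bit_values))
```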

📣 IBM DFSORT allows us to sort on the bits of a byte with "bytes.bits" notation. 

How to sort on the bits of a byte?

Now that we know each character has an 8-bit binary value, we can use the bytes.bits notation to sort using bits.
  • First, specify the byte location relative to the beginning of the record and follow it with a period.
  • Then, specify the bit location relative to the beginning of that byte. Remember that the first (high-order) bit of a byte is bit 0 (not bit 1); the remaining bits are numbered 1 through 7.
In SORT FIELDS=(6.7,0.1,BI,A),EQUALS statement,
6.7 - says that the starting position is the last bit in byte 6. 
0.1 - says that the length is 1 bit. 
BI  - for Binary format as we want to sort on bits
A  - for Ascending order. 
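To make the bytes.bits notation concrete, here is a small Python sketch that pulls such a field out of a record. The bit_field helper is a hypothetical function written for this post (it is not part of DFSORT or any library), and it assumes the record is EBCDIC encoded with code page cp037.

```python
def bit_field(record: bytes, start: str, length: str) -> int:
    """Return the value of a bytes.bits field as an unsigned integer.

    start  -- "byte.bit": byte is 1-based, bit is 0-based from the high-order bit
    length -- "bytes.bits": whole bytes plus extra bits
    """
    start_byte, start_bit = (int(part) for part in start.split("."))
    len_bytes, len_bits = (int(part) for part in length.split("."))

    first = (start_byte - 1) * 8 + start_bit   # offset of the first bit of the field
    count = len_bytes * 8 + len_bits           # number of bits in the field

    all_bits = "".join(format(byte, "08b") for byte in record)
    return int(all_bits[first:first + count], 2)


for number in ("000001", "000002", "000003", "000004"):
    record = number.encode("cp037")            # EBCDIC bytes of the record
    print(number, bit_field(record, "6.7", "0.1"))
```

With start 6.7 and length 0.1, the helper returns the low-order bit of the 6th byte, which is the 1-bit control field used in the SORT statement.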

But why 6.7 as the start position of the control field? 

That's because by looking at the 6th byte of every sequence number, we can say whether that's an even number or odd number.

Example:
000001 - 👀 -> that's an odd number
000002 - 👀 -> that's an even
000003 - 👀 -> that's an odd
000004 - 👀 -> that's an even
000005 - 👀 -> that's an odd
000006 - 👀 -> that's an even. I'm tired 😑
....
....
.... and so on.

More importantly, for each even number, the least significant bit (the last bit) of that byte is 0 and for each odd number, it's 1. 

Example:
1         EBCDIC character
F1        Hexadecimal  
11110001  Binary

2         EBCDIC character
F2        Hexadecimal
11110010  Binary

3         EBCDIC character   
F3        Hexadecimal
11110011  Binary

4         EBCDIC character
F4        Hexadecimal
11110100  Binary

Hence, if we sort the last bit of 6th byte in ascending order, we would get all the even numbered records first, followed by the odd numbered records. 

The EQUALS parameter is coded in the SORT statement to preserve the original sequence of the records that have equal control fields. If EQUALS is not coded, the output will still have all the even numbered records first, followed by the odd numbered records, but the records within each of those two sets are not guaranteed to stay in their original (sorted) order.
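The effect of the bit sort together with EQUALS can be imitated in plain Python, because Python's sorted() is a stable sort. This is a rough sketch, assuming the records are EBCDIC (cp037); the function name last_bit_of_byte_6 is made up for illustration.

```python
records = [str(n).zfill(6) for n in range(1, 21)]

def last_bit_of_byte_6(record: str) -> int:
    # Low-order bit of the 6th byte in EBCDIC (the field 6.7 with length 0.1).
    return record.encode("cp037")[5] & 1

# sorted() is stable, so records with equal keys keep their input order,
# which is the behavior EQUALS asks DFSORT for.
output = sorted(records, key=last_bit_of_byte_6)
print("\n".join(output))
```

Because the input is already in sequence, the stable sort keeps the even records and the odd records in ascending order within their sets.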

Let's try running this SORT operation using Python 🐍


We can make use of the Python APIs provided by ZOAU to run the SORT operation. Z Open Automation Utilities (abbreviated to ZOAU) lets you perform many tasks on z/OS without needing to get into JCL. IBM has built a bridge between Python and z/OS by creating APIs for Python which allow Pythonistas to access z/OS resources. 

Before we start, we need the following stuff to run the SORT operation from Python:
  1. VS Code with Zowe explorer and IBM Z Open Editor extensions.
  2. Access to Zowe explorer.
  3. Access to USS (Unix System Services). 
  4. Little bit of Python Skills.
Note: Access to Zowe explorer and USS can be obtained when you sign up for MTM2020.

First, we need to create a new file under your home directory (/z/zxxxxx) in Unix System Services. Use the touch command to create a new file.

I created one using this command, touch run_sort.py. Then, I used the IBM Z Open Editor to write the following code inside this file.

 
I've used Trinket to embed the Python code in this blog post. Note that you may not be able to RUN 🏃 this script as the ZOAU utilities for Python aren't available in Trinket.

Let's walk through the code.

Lines 1 thru 3: The required ZOAU libraries for Python are imported so that you can use them in your code. 

Lines 5 thru 9: Line #5 uses the os.getenv() function in Python with 'USER' as the argument. As Python is running under z/OS, the USERID variable is assigned your TSO user ID. 
Lines 6 thru 9 define 3 string variables that store the dataset names. 

Lines 11 thru 28: These lines mimic the functionality of the IEFBR14 utility. They delete the datasets, if they exist, before creating them. We make use of the zoautil_py.Datasets module which has several dataset related functions like create, delete, exists and so on. 

Lines 30 thru 41: Writes data into the SORTIN and SYSIN datasets. 
Line #31 defines an empty list called num. This list is created to store the sequence numbers from 1 to 20. Read more about lists and how to access the elements in a list 👉 here
Lines 34 and 35 create sequence numbers from 1 to 20 with the help of a for loop and the range() function in Python. The zfill() method is used to populate leading zeros. As zfill() can be applied only to string data, the numbers are converted to strings using the str() function (in line #35). All the sequence numbers are appended to the list, num. The list is then written to the SORTIN dataset.
Lines 40 and 41 write the SORT statements to the SYSIN dataset. The write functionality is achieved via the zoautil_py.Datasets module.
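The part of the script that builds the input data can be sketched in plain Python as shown below. This is a reconstruction from the description above, not the embedded script itself: the ZOAU calls that write to the datasets are left out, the names sortin_data and sysin_data are made up, and the SYSIN content is assumed to be the bit-level SORT statement discussed earlier.

```python
num = []                                  # list that will hold the sequence numbers

for i in range(1, 21):                    # 1 through 20
    num.append(str(i).zfill(6))           # zfill() works on strings, so convert first

sortin_data = "\n".join(num)              # one record per line, ready to be written

# Assumption: SYSIN carries the SORT statement from earlier in this post.
sysin_data = "  SORT FIELDS=(6.7,0.1,BI,A),EQUALS"

print(sortin_data)
print(sysin_data)
```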

Lines 43 thru 53: Line 44 creates an empty list called dd_names to store the DD names that are needed for the SORT program to run. 
In Line 53, MVSCmd.execute API is called to run the program SORT with arguments MSGPRT=CRITICAL,LIST (which goes to the PARM parameter in EXEC statement) and the list of DDStatements created in lines 47 thru 50. When this instruction is executed, a job might be submitted on z/OS in the background. 

Line #55 checks the return code from MVSCmd.execute API call. If it's zero, a message is displayed in the terminal and the output dataset (SORTOUT) created from the previous execution is read. 

When the program is run with python3 run_sort.py command in terminal, we get the following output.


The records in the output dataset are displayed in the terminal after running the Python code. The even numbered records are at the top, followed by the odd numbered records.

The datasets that were created from Python can also be accessed from the terminal. The datasets are shown below:

The input dataset to the SORT program. The sequence numbers were generated in Python.


The SYSIN input to the SORT program. The SORT statements were written from Python.


The output dataset. 

We have reached the bottom of this post and we discussed two things:
  1. How to sort on the bits of a byte using IBM DFSORT?
  2. How to perform DFSORT operation using Python? 
I hope the content in this post was helpful to you. Please post your questions/suggestions in the Comments section of this post. 

Thx and Happy Weekend!


Friday, February 19, 2021

How to pass more than 100 bytes of data from JCL to COBOL?

There is a famous interview question one can expect in COBOL interviews.

In how many ways can you pass data from JCL to a COBOL program? 

If your answer is 3 (via SYSIN, via PARM and File Input), this blog post is for you as it's time 🕐 to update your answer. Let's begin..

I have prepared the following COBOL Program:
 =COLS> ---1----+----2----+----3----+----4----+----5----+----6----+----7--     
 ****** ***************************** Top of Data ******************************  
 000100 ID DIVISION.                                
 000200 PROGRAM-ID. CBL3.                             
 000210 AUTHOR. SRINIVASAN.                            
 000220 DATA DIVISION.                               
 000230 WORKING-STORAGE SECTION.                          
 000240 01 WS-DISPLAY           PIC X(10).               
 000241 01 WS-LENGTH            PIC S9(4) SIGN LEADING         
 000242                   SEPARATE.                
 000250 LINKAGE SECTION.                              
 000260 01 WS-PARM-GROUP.                             
 000270   05 WS-PARM-LEN         PIC S9(4) COMP.             
 000280   05 WS-PARM-DATA         PIC X(100).               
 000290 PROCEDURE DIVISION USING WS-PARM-GROUP.                  
 000300   ADD WS-PARM-LEN TO ZERO GIVING WS-LENGTH.               
 000400   MOVE WS-PARM-DATA(91:10) TO WS-DISPLAY.                
 000410   DISPLAY 'PARM LEN  :' WS-LENGTH.                   
 000420   DISPLAY 'WS-DISPLAY :' WS-DISPLAY.                   
 000500   STOP RUN.                               
 ****** **************************** Bottom of Data ****************************  
Pretty simple program. I've defined 2 data items in the LINKAGE SECTION to handle the PARM data passed from JCL to COBOL program. The first data item (WS-PARM-LEN) holds the length of the data passed from the JCL and the second data item (WS-PARM-DATA) holds the data itself. 

In the PROCEDURE DIVISION, I'm simply displaying the length of data passed and a portion of the data i.e., 10 bytes starting from 91st position. 

One important thing that you should note here in the program is the order of the data items defined in the LINKAGE SECTION. The data field is always preceded by a two-byte length field defined in binary format. 
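The layout of that parameter area can be pictured with a short Python sketch. It is only an illustration of the two-byte length followed by the data, under the assumption that the data is EBCDIC (cp037); build_parm_area and read_parm_area are hypothetical helper names, not system services.

```python
import struct

def build_parm_area(parm: str) -> bytes:
    """Halfword binary length followed by the EBCDIC PARM text."""
    data = parm.encode("cp037")
    return struct.pack(">h", len(data)) + data     # big-endian signed 2-byte length

def read_parm_area(area: bytes):
    """Split the area the way WS-PARM-LEN and WS-PARM-DATA map it."""
    (length,) = struct.unpack(">h", area[:2])      # WS-PARM-LEN  PIC S9(4) COMP
    return length, area[2:2 + length].decode("cp037")

area = build_parm_area("1234567890" * 10)          # 100 bytes of PARM data
length, data = read_parm_area(area)
print(length, data[90:100])                        # like WS-PARM-DATA(91:10)
```

The first two bytes of the area carry the length, which is why the length item has to come first in the LINKAGE SECTION group.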

Let's take a look at the JCL now. 
 =COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--  
 ****** ***************************** Top of Data ******************************  
 000001 //Z01071C  JOB 1,NOTIFY=&SYSUID                       
 000002 //***************************************************/           
 000003 //* COBOL COMPILE AND LINK EDIT                       
 000004 //COBRUN  EXEC IGYWCL                            
 000005 //COBOL.SYSIN  DD DSN=&SYSUID..PDS(CBL3),DISP=SHR              
 000006 //LKED.SYSLMOD DD DSN=&SYSUID..LOAD(CBL3),DISP=SHR             
 000007 //***************************************************/           
 000008 // IF RC = 0 THEN                              
 000009 //***************************************************/           
 000010 //RUN     EXEC PGM=CBL3,PARM='12345678901234567890123456789012345678901   
 000011 //             23456789012345678901234567890123456789012345678901234567   
 000012 //             890'                             
 000013 //STEPLIB   DD DSN=&SYSUID..LOAD,DISP=SHR                  
 000014 //SYSOUT    DD SYSOUT=*,OUTLIM=15000                    
 000015 //CEEDUMP   DD DUMMY                            
 000016 //SYSUDUMP  DD DUMMY                            
 000017 //***************************************************/           
 000018 // ELSE                                   
 000019 // ENDIF                                  
 ****** **************************** Bottom of Data ****************************  
The first step invokes a PROC (IGYWCL) which compiles and link-edits the COBOL program. The second step (RUN 🏃) will run if the return code from the first step is 0. 

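In the second step, the PARM parameter is used (in line #10) to pass 100 bytes of data. Note that when a PARM value enclosed in apostrophes doesn't fit on one line, we type it through the 71st position in the JCL and continue it from column 16 on the next line, as done in lines #10 thru #12. (Parameters that are not enclosed in apostrophes are broken after a comma and can be continued anywhere from column 4 thru 16.) 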
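In the JCL above, the PARM data runs through column 71 on the first line and resumes in column 16 on the following lines. The sketch below generates the same three lines in Python. The continue_parm function is a hypothetical helper that ignores special cases such as an apostrophe landing in column 71.

```python
def continue_parm(statement_start: str, value: str) -> list:
    """Split a quoted PARM value across JCL lines (columns 16 to 71 on continuations)."""
    text = statement_start + "'" + value + "'"
    lines = [text[:71]]                            # first line runs through column 71
    rest = text[71:]
    while rest:
        lines.append("//" + " " * 13 + rest[:56])  # resume in column 16 (56 columns left)
        rest = rest[56:]
    return lines

jcl = continue_parm("//RUN     EXEC PGM=CBL3,PARM=", "1234567890" * 10)
print("\n".join(jcl))
```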

After submitting this job, we get the following output written in SYSOUT:
 ********************************* TOP OF DATA **********************************  
 PARM LEN   :+0100                                  
 WS-DISPLAY :1234567890                               
 ******************************** BOTTOM OF DATA ********************************  

Now, let's see what happens if we try to pass PARM data with 101 bytes (I've added one additional byte in line #12). 
 =COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--  
 ****** ***************************** Top of Data ******************************  
 000001 //Z01071C  JOB 1,NOTIFY=&SYSUID                       
 000002 //***************************************************/           
 000003 //* COBOL COMPILE AND LINK EDIT                       
 000004 //COBRUN  EXEC IGYWCL                            
 000005 //COBOL.SYSIN  DD DSN=&SYSUID..PDS(CBL3),DISP=SHR              
 000006 //LKED.SYSLMOD DD DSN=&SYSUID..LOAD(CBL3),DISP=SHR             
 000007 //***************************************************/           
 000008 // IF RC = 0 THEN                              
 000009 //***************************************************/           
 000010 //RUN     EXEC PGM=CBL3,PARM='12345678901234567890123456789012345678901   
 000011 //             23456789012345678901234567890123456789012345678901234567   
 000012 //             8901'                             
 000013 //STEPLIB   DD DSN=&SYSUID..LOAD,DISP=SHR                  
 000014 //SYSOUT    DD SYSOUT=*,OUTLIM=15000                    
 000015 //CEEDUMP   DD DUMMY                            
 000016 //SYSUDUMP  DD DUMMY                            
 000017 //***************************************************/           
 000018 // ELSE                                   
 000019 // ENDIF                                  
 ****** **************************** Bottom of Data ****************************  
After submitting the job, we find that it has failed with a JCL error:
  
IEF642I EXCESSIVE PARAMETER LENGTH IN THE PARM FIELD

The maximum number of bytes that we can pass from JCL to COBOL, using PARM parameter, is 100.

How to pass more than 100 bytes of data from JCL to COBOL? 🤔




We can make use of PARMDD parameter to pass more than 100 bytes of data from JCL to COBOL. The best way to understand PARMDD is by looking at an example.
Note that PARMDD and PARM parameters are mutually exclusive.
I've just modified the previous JCL by replacing the PARM parameter with PARMDD. No changes were made to the COBOL program, so it doesn't need to be compiled again. Hence, RESTART=RUN is coded in the JOB statement to start the job from the RUN step.   
 ****** ***************************** Top of Data ******************************  
 000001 //Z01071C  JOB 1,NOTIFY=&SYSUID,RESTART=RUN                 
 000002 //***************************************************/           
 000003 //* COBOL COMPILE AND LINK EDIT                       
 000004 //COBRUN  EXEC IGYWCL                            
 000005 //COBOL.SYSIN DD DSN=&SYSUID..PDS(CBL3),DISP=SHR              
 000006 //LKED.SYSLMOD DD DSN=&SYSUID..LOAD(CBL3),DISP=SHR             
 000007 //***************************************************/           
 000008 // IF RC = 0 THEN                              
 000009 //***************************************************/           
 000010 //RUN     EXEC PGM=CBL3,PARMDD=MYDD                     
 000011 //STEPLIB   DD DSN=&SYSUID..LOAD,DISP=SHR                  
 000012 //MYDD      DD DISP=SHR,DSN=Z01071.PARMDD.INPUT.PS             
 000013 //SYSOUT    DD SYSOUT=*,OUTLIM=15000                    
 000014 //CEEDUMP   DD DUMMY                            
 000015 //SYSUDUMP  DD DUMMY                            
 000016 //***************************************************/           
 000017 // ELSE                                   
 000018 // ENDIF                                  
 ****** **************************** Bottom of Data ****************************  
Make a note of the PARMDD=MYDD parameter (in line #10) and the MYDD DD statement (in line #12). The PARMDD parameter must be used in conjunction with a DD statement. 

Here, the PARMDD keyword specifies a DD name, MYDD, which is then coded on a DD statement (in line #12) that specifies a dataset, Z01071.PARMDD.INPUT.PS, whose record length is 130. There is one record in this dataset which is 125 bytes long. 

After submitting this JCL, we get the following output. 
 ********************************* TOP OF DATA **********************************  
 PARM LEN  :+0125                                  
 WS-DISPLAY :1234567890                               
 ******************************** BOTTOM OF DATA ********************************  

The COBOL program doesn't have any File definitions. Rather, it contains instructions to handle information retrieved from PARM. We are making use of the same instructions/definitions (coded within the LINKAGE SECTION) for PARMDD as well. 
PARMDD is different from the File input way of passing data. Both involve datasets but the definitions coded within the program to handle the data differ.
We are passing 125 bytes of data with PARMDD but the WS-PARM-DATA field in the program is declared with only 100 bytes. So, the program sees only the first 100 bytes through WS-PARM-DATA and the last 25 bytes are effectively truncated. Hence, it is imperative that we code proper definitions in the program to handle the data passed by the PARMDD parameter.  
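Here is a quick Python sketch of what the program sees with its 100-byte definition. The content of the 125-byte record isn't shown in this post, so the sketch assumes that it repeats the digits 1234567890, which is consistent with the WS-DISPLAY value in the output above.

```python
# Assumption: the 125-byte record repeats the digits 1234567890.
record = ("1234567890" * 13)[:125]

parm_len = len(record)             # what WS-PARM-LEN would hold
parm_data = record[:100]           # WS-PARM-DATA is PIC X(100)
display = parm_data[90:100]        # WS-PARM-DATA(91:10)
unseen = record[100:]              # bytes beyond the 100-byte definition

print(f"PARM LEN  :{parm_len:+05d}")
print(f"WS-DISPLAY :{display}")
print(f"Bytes not covered by WS-PARM-DATA: {len(unseen)}")
```

Declaring WS-PARM-DATA with a length that covers the largest expected input would make all the bytes addressable in the program.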

We've reached the bottom of the post. So, if someone asks you in how many ways you can pass data from JCL to COBOL, the answer is 4:
  1. File Input
  2. via SYSIN
  3. via PARM
  4. and via PARMDD
Hope this helps. Should you have any questions/suggestions, please post them in the Comments section of this post. Thx.