Sunday, January 31, 2021

How to convert rows to columns in a dataset using IBM DFSORT?

Welcome! In this blog post, I am writing about another scenario. Are you ready?

One of the sites I worked for dealt with invoices, approvals and payments. There was a workflow which usually began at the point where the system received invoices in bulk. The system would then review and approve the supplier invoices. Upon approval, the supplier was paid. 

The application was geographically spread out in 3 regions of the globe and the workflow was common across the regions.

There was a requirement to create a report capturing the monthly statistics on invoices, approvals and payments across the 3 regions and send it via email to the appropriate stakeholders. 

Invoices, approvals and payments' data were stored in different DB2 tables. Hence, 3 SQL SELECT queries were combined using UNION ALL to pull the necessary details. The output from SQL had the following data:

INVOICES|+00099999
APPROVALS|+00012345
PAYMENTS|+00003456

It was a pipe-delimited file, so if you imported it into Microsoft Excel with the pipe delimiter, you'd get 2 columns and 3 rows as shown below.



Since we had to pull this data for 3 regions, there were 3 output datasets with the same record structure.

The next step was to create a report out of these files. The expected final output was,

The final report. Rows are converted to columns. There is a header and a trailer record too. 

We used ICETOOL ⛄ to create the report.

There's a lot of stuff we put together with ICETOOL to accomplish this output. Let's look at it one by one. 

ICETOOL is a multipurpose DFSORT utility that uses the capabilities of DFSORT to perform multiple operations on one or more datasets in a single step.
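If you haven't used ICETOOL before, the general shape of such a step looks like the sketch below. The dataset names are made up, and the DFSORT control statements we build through this post would go into the CTL1CNTL DD:

//STEP01   EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN       DD DISP=SHR,DSN=YOUR.SQL.OUTPUT1
//         DD DISP=SHR,DSN=YOUR.SQL.OUTPUT2
//         DD DISP=SHR,DSN=YOUR.SQL.OUTPUT3
//OUT      DD SYSOUT=*
//TOOLIN   DD *
  COPY FROM(IN) TO(OUT) USING(CTL1)
/*
//CTL1CNTL DD *
* DFSORT CONTROL STATEMENTS GO HERE
/*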

Step 1: 


Each input dataset was a pipe-delimited file. The 3 SQL output datasets were concatenated and passed as one input to the ICETOOL step. There were 2 columns (fields) in each record of the dataset, but the columns didn't start at fixed positions. For example, the count for Invoices would start at the 10th position, whereas for Approvals it started at the 11th position. So, we had to make the columns start at fixed positions, and we used the PARSE operand in DFSORT to do that. 

As we knew each output dataset from SQL would have only 3 records and they would all look similar (except for the variation in the count values), WHEN=GROUP in DFSORT was used to create 3 groups. Each group was assigned a unique ID, and each record in a group was assigned a sequence number.

WHEN=GROUP is used to introduce a group of records to DFSORT.
 
Well, you may ask why we did this 🤔. With sequence numbers and IDs for the groups, we can manipulate/arrange the records any way we wish.  

At this stage, the output looked like below:

 =COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--  
 ****** ***************************** Top of Data ******************************  
 000001 INVOICES  +000999991 1                           
 000002 APPROVALS +000123451 2                           
 000003 PAYMENTS  +000034561 3                           
 000004 INVOICES  +000000192 1                           
 000005 APPROVALS +000000162 2                           
 000006 PAYMENTS  +000000152 3                           
 000007 INVOICES  +001000503 1                           
 000008 APPROVALS +000123613 2                           
 000009 PAYMENTS  +000034713 3                           
 ****** **************************** Bottom of Data ****************************  
The following points are worth noting in this output:

1. The fields were in fixed positions now💪. 
2. There were 3 groups: each group was assigned a unique ID in position 21, and each record in a group was assigned a sequence number in position 23.


Let's see how we converted the fields from rows to columns.

Consider the first 3 records (first group). 
 =COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--  
 ****** ***************************** Top of Data ******************************  
 000001 INVOICES  +000999991 1                           
 000002 APPROVALS +000123451 2                           
 000003 PAYMENTS  +000034561 3   
Note: For the remainder of this blog post, ignore the string literals (the first 10 bytes) and the (+) symbol in the 12th position (the counts can't go below zero in our case).

The count values (in positions 13 thru 20) are placed row-wise, one after the other. If we sum these count values, we'll get 115800. 

Right! What if we had the count values in the following way? The count values are still placed row-wise, one after the other, but they are in different positions in each record. And, there are a lot of zeros🤔.
 =COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--  
 ****** ***************************** Top of Data ******************************  
 000001 00099999 00000000 00000000                         
 000002 00000000 00012345 00000000                          
 000003 00000000 00000000 00003456  
Now, if we sum each of the columns (we have got 3 columns here) we would get the following output. 
 =COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--  
 ****** ***************************** Top of Data ******************************  
 000001 00099999 00012345 00003456   

You just witnessed how we converted the rows to columns 👏. Awesome? All thanks goes to those zeros in each record, which helped us consolidate 3 records into one. Those zeros acted as placeholders.

We implemented this idea using DFSORT. Instead of zeros, we inserted Binary Zeros (Hex Value - 00) to the input records in the following way: 
  • The group's first record was populated in first 8 bytes. All other positions were filled with Binary zeros. 
  • The group's second record was populated in positions 10 thru 17. All other positions were filled with Binary zeros.
  • The group's third record was populated in positions 19 thru 26. All other positions were filled with Binary zeros. 
A group's first, second and third records were identified by the sequence numbers (in the 23rd position) that we populated for each group with the PUSH parameter. 
Binary zeros can be used in DFSORT as placeholders so that it can be filled with data at a later point of time.

Let's look at the code now. 

INREC PARSE=(%01=(ENDBEFR=C'|',FIXLEN=10),%02=(ENDBEFR=C'|',     
            FIXLEN=9)),BUILD=(%01,12:%02)                        
SORT FIELDS=COPY                                                 
OUTREC IFTHEN=(WHEN=GROUP,RECORDS=3,PUSH=(21:ID=1,23:SEQ=1)),    
       IFTHEN=(WHEN=(23,1,CH,EQ,C'1'),BUILD=(1:13,8,18Z,70:21,1, 
              80:X)),                                            
       IFTHEN=(WHEN=(23,1,CH,EQ,C'2'),BUILD=(9Z,10:13,8,18:9Z,   
              70:21,1,80:X)),                                    
       IFTHEN=(WHEN=(23,1,CH,EQ,C'3'),BUILD=(18Z,19:13,8,70:21,1,
              80:X))                                             

What does this code do? 

INREC PARSE=(%01=(ENDBEFR=C'|',FIXLEN=10),%02=(ENDBEFR=C'|',     
            FIXLEN=9)),BUILD=(%01,12:%02)

ENDBEFR and FIXLEN parameters of the PARSE operand are used to define the rules for extracting variable-length data into %nn fixed parsed fields. ENDBEFR stops extracting data at the byte before the specified string. FIXLEN specifies the length of the fixed area to contain the extracted data. %01 is the fixed parsed field which holds the string literal such as INVOICES, APPROVALS or PAYMENTS. %02 is the fixed parsed field which holds the count value. The BUILD statement is used with INREC PARSE to build the record from the fixed parsed fields. 
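For instance, here's how the first input record flows through PARSE and BUILD (an illustration, not actual DFSORT output):

 INVOICES|+00099999        <- input record
 %01 = 'INVOICES  '        <- 10 bytes, padded with blanks
 %02 = '+00099999'         <- 9 bytes
 INVOICES   +00099999      <- built record: %01 at 1-10, %02 at 12-20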

SORT FIELDS=COPY - SORT statement with COPY option.

OUTREC IFTHEN=(WHEN=GROUP,RECORDS=3,PUSH=(21:ID=1,23:SEQ=1)),    
       IFTHEN=(WHEN=(23,1,CH,EQ,C'1'),BUILD=(1:13,8,18Z,70:21,1, 
              80:X)),                                            
       IFTHEN=(WHEN=(23,1,CH,EQ,C'2'),BUILD=(9Z,10:13,8,18:9Z,   
              70:21,1,80:X)),                                    
       IFTHEN=(WHEN=(23,1,CH,EQ,C'3'),BUILD=(18Z,19:13,8,70:21,1,
              80:X))  

There are 4 IFTHEN...WHEN conditions coded in OUTREC. The first condition assigns a unique ID for each set of 3 records and a sequence number for each record in the group; the PUSH parameter helps in doing that. A group is identified with the WHEN=GROUP and RECORDS=3 parameters.
 
The 2nd, 3rd and 4th IFTHEN...WHEN conditions reformat the input records that have sequence numbers 1, 2 and 3 respectively. Binary zeros are inserted in the BUILD statements; for example, 18Z means insert 18 binary zeros. Note that every position which doesn't hold a count value is filled with binary zeros rather than blanks; this matters, because the SUM coming up later adds these positions as binary fields. 

After running these SORT statements, we got the following output:
 =COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--  
 ****** ***************************** Top of Data ******************************  
 000001 00099999                                                             1   
 000002          00012345                                                    1   
 000003                   00003456                                           1   
 000004 00000019                                                             2   
 000005          00000016                                                    2   
 000006                   00000015                                           2   
 000007 00100050                                                             3   
 000008          00012361                                                    3   
 000009                   00003471                                           3   
 ****** **************************** Bottom of Data ****************************  

Turning the HEX mode ON, you'll be able to see the binary zeros (X'00') spread out over each record.

 =COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--  
 000001 00099999                                                             1   
        FFFFFFFF0000000000000000004444444444444444444444444444444444444444444F44  
        000999990000000000000000000000000000000000000000000000000000000000000100  
 ------------------------------------------------------------------------------   
 000002          00012345                                                    1   
        000000000FFFFFFFF0000000004444444444444444444444444444444444444444444F44  
        000000000000123450000000000000000000000000000000000000000000000000000100  
 ------------------------------------------------------------------------------   
 000003                   00003456                                           1   
        000000000000000000FFFFFFFF4444444444444444444444444444444444444444444F44  
        000000000000000000000034560000000000000000000000000000000000000000000100  
 ------------------------------------------------------------------------------   
 000004 00000019                                                             2   
        FFFFFFFF0000000000000000004444444444444444444444444444444444444444444F44  
        000000190000000000000000000000000000000000000000000000000000000000000200  
 ------------------------------------------------------------------------------   
 000005          00000016                                                    2   
        000000000FFFFFFFF0000000004444444444444444444444444444444444444444444F44  
        000000000000000160000000000000000000000000000000000000000000000000000200  
 ------------------------------------------------------------------------------   
 000006                   00000015                                           2   
        000000000000000000FFFFFFFF4444444444444444444444444444444444444444444F44  
        000000000000000000000000150000000000000000000000000000000000000000000200  
 ------------------------------------------------------------------------------   
 000007 00100050                                                             3   
        FFFFFFFF0000000000000000004444444444444444444444444444444444444444444F44  
        001000500000000000000000000000000000000000000000000000000000000000000300  
 ------------------------------------------------------------------------------   
 000008          00012361                                                    3   
        000000000FFFFFFFF0000000004444444444444444444444444444444444444444444F44  
        000000000000123610000000000000000000000000000000000000000000000000000300  
 ------------------------------------------------------------------------------   
 000009                   00003471                                           3   
        000000000000000000FFFFFFFF4444444444444444444444444444444444444444444F44  
        000000000000000000000034710000000000000000000000000000000000000000000300  
 ------------------------------------------------------------------------------   
   
It's time πŸ•’ to sum the records to one, thereby converting the rows to column. 

Using SUM FIELDS with Binary Zeros has got a perk⭐


Hex representation of the first count value (00099999) is F0F0F0F9F9F9F9F9

Hexadecimal representation of EBCDIC displayable characters. 

The hex value of a binary zero is 00.

When the count value is added to binary zeros (in BI format), 

 F0F0F0F9F9F9F9F9
+0000000000000000
 ----------------
 F0F0F0F9F9F9F9F9 → 00099999

We get the same value πŸ˜€. 

Let's take another example. The Hex value of my name (in upper case) is E2D9C9D5C9E5C1E2C1D5

Hexadecimal representation of EBCDIC displayable characters.

When my name, a string, is added to binary zeros (in BI format), I get my name back. 

 E2D9C9D5C9E5C1E2C1D5
+00000000000000000000
 --------------------
 E2D9C9D5C9E5C1E2C1D5 → SRINIVASAN

The takeaway is - with Binary zeros, we can even use SUM FIELDS on strings 😎.

Step 2:


We summed the count values on their Group ID. This means, 
  • The count value in the first record (in positions 1 thru 8) of each group was summed up with first 8 bytes of binary zeros in records 2 and 3 of the group. 
  • The count value in the second record (in positions 10 thru 17) of each group was summed up with 8 bytes of binary zeros (in positions 10 thru 17) in records 1 and 3 of the group. 
  • The count value in the third record (in positions 19 thru 26) of each group was summed up with 8 bytes of binary zeros (in positions 19 thru 26) in records 1 and 2 of the group. 

Let's look at the code. 

SORT FIELDS=(70,1,CH,A)                                     
SUM FIELDS=(1,8,BI,10,8,BI,19,8,BI)                         
OUTREC FIELDS=(1,8,X,10,8,X,19,8,X,C'1',71:SEQNUM,1,ZD,80:X) 

If a field is in BI format, it can be 2, 4 or 8 bytes long. Here, the count value and binary zeros are 8 bytes long, hence we SUM fields on 8 bytes of BI.

OUTREC FIELDS is used to format the summed output fields. 

We got the following output. 
 =COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--  
 ****** ***************************** Top of Data ******************************  
 000001 00099999 00012345 00003456 1                     1   
 000002 00000019 00000016 00000015 1                     2   
 000003 00100050 00012361 00003471 1                     3   
 ****** **************************** Bottom of Data ****************************   

If you compare this output with the one that we got after using the PARSE operand, 
 =COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--  
 ****** ***************************** Top of Data ******************************  
 000001 INVOICES  +000999991 1                           
 000002 APPROVALS +000123451 2                           
 000003 PAYMENTS  +000034561 3                           
 000004 INVOICES  +000000192 1                           
 000005 APPROVALS +000000162 2                           
 000006 PAYMENTS  +000000152 3                           
 000007 INVOICES  +001000503 1                           
 000008 APPROVALS +000123613 2                           
 000009 PAYMENTS  +000034713 3                           
 ****** **************************** Bottom of Data ****************************  

you'll notice that the rows are converted to columns now. 


That's it. I hope you now understand the usage of binary zeros. In the next post, I will continue with the rest of the code that we prepared to meet the requirements. 

Should you have any questions/suggestions, please use the Comments section below. Thx.  


Sunday, January 24, 2021

How to exclude a group of records using IBM DFSORT?

One of the sites I worked for dealt with invoices. We used Electronic Data Interchange (EDI) to exchange trading information in a standard format with suppliers/customers. 

EDI eliminates a lot of manual intervention through the transmission of electronic messages between computer systems. Do not confuse the term electronic message with an e-mail message. An email message 📧 is usually not processed by the receiving system, whereas EDI messages are intended to undergo automatic processing in the receiving system. 

There is a specific format in EDI to transmit invoice data: EDI 810. It is an electronic version of an invoice that contains the information required for a usual paper-invoice purchase transaction. 

Use of EDI to transmit data can shorten the lead time between invoice receipt and fulfillment of orders. EDI is often referred to as 'paperless trading'.

The EDI 810 documents which were received from the site's customers were translated to a proprietary file format as required by the backend application. The translated file usually had a lot of batches, and within each batch there were a lot of invoices. An invoice would have detail record(s) within it. Refer to the following pictures for a better understanding of the file's structure.

Note: Click on the picture to get an enlarged view.

Translated EDI 810 file with a lot of batches in it.


A batch from the translated EDI 810 file with many invoices; each invoice with many detail records.


Each batch would have a batch header and a batch trailer record. Each invoice within a batch would have an invoice header and an invoice trailer record. Between an invoice's header and trailer records would be the invoice detail records. 

The format of the file had a record type field in each record to identify whether that record is a batch header, a batch trailer, an invoice header, an invoice detail, an invoice trailer and so on. 

This translated file was fed as input to a COBOL DB2 program. During the program's execution, each invoice and its detail record(s) were fed to an internal table with a maximum limit hardcoded as 99,999. Sometimes, an invoice's detail records would exceed 99,999, making the program fail with a S0C4 abend. The abend occurred so frequently that we had to implement a permanent fix. 

We decided to remove the entire batch which had invoice(s), whose detail records exceeded 99,999.

Experts suggested writing a COBOL program to exclude the batch, but I had other ideas (DFSORT 😎).

If that title sounds familiar, all thanks goes to McDowell's advertisement which featured MS Dhoni 🏏.

The approach we adopted was to uniquely number each record in the following way:
  • Assign a unique ID, 9 bytes long, to each batch in the file.
  • Assign a unique ID, 9 bytes long, to each Invoice that is part of the batch
  • Assign sequence numbers, 9 bytes long, for each record that is part of the Invoice.
We used the numbered file to check if the invoice sequence number exceeded 99,999. If true, the batch ID of that invoice was written to an intermediate file.

The JOINKEYS application in DFSORT was used to match the numbered file and the intermediate file to create 2 output files: a BYPASS file that had the records of the batch which had to be bypassed (as its invoice detail records exceeded 99,999), and a FINAL file that had the records of all the other batches (whose invoice detail records didn't exceed 99,999).

Note: For better understanding, I've created an input file which is somewhat similar to what I had at my site. I'll be using that input file in an example to show how we can exclude a group of records from the file. The IDs and sequence numbers are only 5 bytes long in the example, as opposed to the 9 bytes that we used at our site. Also, in the example we will exclude the batch which holds an invoice with detail records exceeding 20, as opposed to 99,999. 

1st byte of the input file is the Record type field. 
0 - Batch header record. 
1 - Invoice header record. 
8 - Invoice trailer record. 
9 - Batch trailer record.
Other record types are out of scope for the problem we are dealing with.


 
The input file with 2 batches. Each batch has got an invoice. The 2nd batch's invoice detail records exceed 20.


Step 1: 
The first step is to assign,
  • a unique ID for each batch,
  • a unique ID for each invoice within that batch,
  • sequence numbers for all the records between an Invoice header and Invoice trailer record.
To do this, we will use DFSORT's IFTHEN WHEN=GROUP feature, with BEGIN and PUSH parameters. 

First step. Assigning unique IDs. 


INREC in DFSORT is used to format fields before sorting.

First INREC IFTHEN condition assigns a unique ID for each batch:

WHEN=GROUP - Used to introduce a group of records to DFSORT. 

BEGIN=(1,1,CH,EQ,C'0') - Each '0' in the first byte indicates the start of a new batch. 

PUSH=(41:ID=5) - Assigns an identifier, 5 bytes long starting from the 41st position, for each batch. The identifier is +1'd when a new batch is started. 


The second INREC IFTHEN condition assigns a unique ID for each invoice within a batch and sequence numbers for the records that are part of the invoice. 

WHEN=GROUP - Used to introduce a group of records to DFSORT.

BEGIN=(1,1,CH,EQ,C'1') - Each '1' in the first byte indicates a new invoice within that batch.

PUSH=(46:ID=5,51:SEQ=5) - Assigns an identifier, 5 bytes long starting from the 46th position, for each invoice. The identifier is +1'd when a new invoice is started. Also assigns a sequence number, 5 bytes long starting from the 51st position. The sequence number restarts from 1 whenever a new invoice header record is encountered.

OUTREC in DFSORT is used to format fields after sorting.

The OUTREC IFTHEN conditions given at the end are used to populate zeros in place of the invoice ID and invoice sequence number in the batch header and trailer records, as in the sketch below. 
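Putting those together, the control statements for this step look roughly like this (a sketch based on the layout described above; the OUTREC conditions use record types '0' and '9' to pick out the batch header and trailer):

INREC IFTHEN=(WHEN=GROUP,BEGIN=(1,1,CH,EQ,C'0'),PUSH=(41:ID=5)),
      IFTHEN=(WHEN=GROUP,BEGIN=(1,1,CH,EQ,C'1'),
              PUSH=(46:ID=5,51:SEQ=5))
SORT FIELDS=COPY
OUTREC IFTHEN=(WHEN=(1,1,CH,EQ,C'0'),OVERLAY=(46:C'0000000000')),
       IFTHEN=(WHEN=(1,1,CH,EQ,C'9'),OVERLAY=(46:C'0000000000'))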

Running the first step will give the following output. 
This is how the numbered file looks.
Col 41 thru 45 contains the Batch ID. 
Col 46 thru 50 contains the Invoice ID. 
Col 51 thru 55 contains the Invoice Sequence number. 


Step 2: 
The second step is to find the batch ID whose invoice sequence number exceeds 20. The output file created in Step 1 is passed as input to this step. It's evident from the input file that Batch 2 contains an invoice which has got more than 20 detail records. 


INCLUDE COND=(51,5,ZD,EQ,21) - Only those records whose sequence number equals 21 are included for processing. 

SORT FIELDS=(41,5,CH,A) - SORT the file based on the batch ID. 

SUM FIELDS=NONE - Remove duplicates. Sometimes there can be multiple invoices within a batch with sequence numbers exceeding 20. In such cases, the batch ID would be written more than once. Hence, we remove duplicates. 

OUTREC FIELDS=(1:41,5,80:X) - Write the Batch ID to an intermediate output file. 
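Put together, the control statements for this step are:

INCLUDE COND=(51,5,ZD,EQ,21)
SORT FIELDS=(41,5,CH,A)
SUM FIELDS=NONE
OUTREC FIELDS=(1:41,5,80:X)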

We get the following output after running Step 2:
The batch ID whose invoice sequence number exceeds 20 is written to the output file.


Step 3:
The final step is to use the JOINKEYS application in DFSORT to write 2 output files: one with the excluded batch, and the other with the batches that should be sent as input to the COBOL DB2 program. 



Input files to be used:
  • The output file from Step 2 (the one that contains the batch ID which should be excluded).
  • The output file from Step 1 (the numbered file). 

Note: The order of the input files in the JCL should be such that the file with no duplicates is provided first. 

JOINKEYS FILES=F1,FIELDS=(1,5,A) - Defines the key field from the first file with which the matching should occur. 

JOINKEYS FILES=F2,FIELDS=(41,5,A) - Defines the key field from the second file with which the matching should occur. 

REFORMAT FIELDS=(F2:1,55,F1:1,5,?) - The REFORMAT statement indicates the fields from File 1 and File 2 that we should include in the joined records. '?' symbol is used as an indicator with the following possible values:
"B" - indicates that it is a paired record.
"1" - indicates that it is an unpaired record from File 1.
"2" - indicates that it is an unpaired record from File 2. 

JOIN UNPAIRED,F2 - This statement is similar to RIGHT JOIN in DB2. The joined records will contain both the paired and unpaired records from File 2. 

SORT FIELDS=COPY - Sort statement. 

OUTFIL FNAMES=BYPASS,BUILD=(1,55),INCLUDE=(61,1,CH,EQ,C'B') - The output file with BYPASS as the DD name in the JCL will be written with the paired records, i.e., all the records in input file 2 which have 00002 as the batch ID. 

OUTFIL FNAMES=FINAL,BUILD=(1,40),INCLUDE=(61,1,CH,EQ,C'2') - The output file with FINAL as the DD name in the JCL will be written with the unpaired records from File 2, i.e., the records whose batch ID is not 00002. 
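Here are the complete control statements for this step:

JOINKEYS FILES=F1,FIELDS=(1,5,A)
JOINKEYS FILES=F2,FIELDS=(41,5,A)
JOIN UNPAIRED,F2
REFORMAT FIELDS=(F2:1,55,F1:1,5,?)
SORT FIELDS=COPY
OUTFIL FNAMES=BYPASS,BUILD=(1,55),INCLUDE=(61,1,CH,EQ,C'B')
OUTFIL FNAMES=FINAL,BUILD=(1,40),INCLUDE=(61,1,CH,EQ,C'2')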

After running Step 3, we get the following output files. 

The BYPASS file that contains all the records with Batch ID 00002.


The FINAL file which will be fed as input to the COBOL DB2 program. 


At our site, we used the BYPASS file in subsequent steps to extract crucial information such as the Batch number, Invoice number and so on, in order to send an email to business with details about the batches that were omitted for the day. 

That's it. As always, thanks much for reading. Have any questions or suggestions? Please post them in the comments section below. Thx!

P.S. Screen grabs of Mainframe screens throughout this post were taken from the Master the Mainframe system.  
 


Wednesday, January 13, 2021

How to send the job completion message to multiple TSO IDs?

Hey!

Let's see how we can make use of the NOTIFY parameter and the NOTIFY statement in JCL to send the job completion message to multiple TSO user IDs. 

I believe 🀞 most of you reading this post would be aware of NOTIFY parameter used in the JOB statement. 

//Z01071A     JOB     1,NOTIFY=&SYSUID

It sends a TSO message to the user to notify them of the success or failure of the job. The user ID is picked up automatically, thanks to &SYSUID. SYSUID is a system symbol in JCL. At run time, &SYSUID is substituted with the user ID under whose authority the job was submitted.

We can also hardcode the user ID instead of providing &SYSUID.

What if we wanted to send the TSO messages to multiple user IDs?



We can use the NOTIFY statement in JCL to send the TSO message 💬 to multiple user IDs. 

The NOTIFY statement is supported by JES2 only, beginning in z/OS V2R3.


JCL with NOTIFY statements and USER parameter.



In the picture shown above, 2 NOTIFY statements are used right after the JOB statement. 
The USER parameter identifies the TSO user ID to whom the job completion status message will be sent. 
TYPE=MSG indicates that the notification should be sent as a TSO message; MSG is the default when the USER parameter is specified. 
A maximum of 8 NOTIFY statements can be specified in a JCL. 
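Here's a minimal sketch of such a JCL (the job name and user IDs are made up):

//Z01071A  JOB 1,NOTIFY=&SYSUID
//NTFY1    NOTIFY USER=Z01072,TYPE=MSG
//NTFY2    NOTIFY USER=Z01073
//STEP01   EXEC PGM=IEFBR14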

You can also use the NOTIFY statement to send an email 📧 to a specified email ID about the job's completion status.

JCL with NOTIFY statement and EMAIL parameter. 

When you use the EMAIL parameter, you can't use the USER parameter, as EMAIL and USER are mutually exclusive.
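A sketch of the EMAIL variant (use a real address; the statement name is made up):

//Z01071A  JOB 1
//NTFYE    NOTIFY EMAIL='your.name@example.com'
//STEP01   EXEC PGM=IEFBR14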

Which of the 2 notification types works well?

The USER parameter works like a charm and will notify the specified user IDs with a TSO message. The job goes through fine with the EMAIL parameter, but I haven't seen an email in my Inbox 😕. I've tried the EMAIL parameter at my site and on the MTM2020 system, but in vain. I wish I could look at the email that's been triggered from the NOTIFY statement.

We've hit the bottom of this post...

NOTIFY statements will be of great help when another team has to submit their job(s) based on the completion of job(s) that you or your team submit. Simply include the participants' user IDs/emails so that they get notified about your job's completion status.

Over to you now. Try the NOTIFY statement at your site. ⚠ If you want to use the EMAIL parameter, ensure that you provide your work email ID. 

Hope this helps. Thanks for reading. Hit a Like if this post was useful. Share your questions/suggestions in the Comments section below. 



Monday, January 11, 2021

How to schedule jobs using JOBGROUP?

I was on Production support that week and (nope! I wasn't haunted by a ghost 👻) I was given the task of running a specific SQL SELECT query against a DB2 table in the Production environment, 2 times a day, throughout that week. If the query returned rows, I had to prepare a one-time job to be run in Prod. Else, do nothing. 

The project had only scheduled releases due to the Agile way of working, and the one-time job that we ran in the Prod environment was a temporary fix that we had to apply until the next release, which was 2 months down the line.  

Therefore, it was imperative for the person on support each week to run that SQL SELECT query 2 times a day, and as we all know, human error is inevitable - there were some misses in running the query. I already knew about JOBGROUP by then. So, I was thinking 🤔 about applying it in this scenario to prevent any misses. 

Thumbnail πŸ˜‰


The main reason JOBGROUP struck my mind was the fact that we had to run the SQL query at specified times. 

What's JOBGROUP πŸ€”?

The JOBGROUP statement in JCL allows you to group several jobs together, thereby allowing you to schedule them and trigger them based on various events. JOBGROUP gives you just some basic job scheduling capabilities; it is not there to replace high-end batch scheduling software like Control-M or CA-7. 

JOBGROUP is available from z/OS Version 2.2

JOBGROUP definition starts with a JOBGROUP statement and ends with an ENDGROUP statement. Within this boundary, we have to define the job names of all the jobs that are being grouped together. We should also specify when each job should run. 

Note: Click on the picture to get an enlarged view. 

Note that the JOBGROUP is given a name (Z01071GP) and an ENDGROUP statement is used to define the end of definition. 

Like the JOB statement, positional parameters such as accounting information and programmer's name can also be specified in the JOBGROUP statement. Additionally, we can provide 2 optional parameters in the JOBGROUP statement. Those are as follows:

ERROR=(condition)  - This parameter will define the error condition that the JOBGROUP needs to evaluate after the completion of each job in the group. 

The syntax of condition remains the same as used for the IF statement in JCL. In the picture shown above, ERROR=(ABEND) is being used and ABEND indicates that an ABEND condition occurred. 

ONERROR=STOP|SUSPEND|FLUSH - This parameter will define the action that needs to be taken when an error occurs. 

When things are put together, ERROR=(ABEND),ONERROR=STOP instructs the JOBGROUP to evaluate each job's output after its completion; if there's an ABEND, it stops submitting new jobs from that JOBGROUP. Jobs that are actively running, though, are allowed to complete.

An example showing the usage of JOBGROUP: 

Let's see how we can define the job names inside a JOBGROUP.

Defining job names in a JOBGROUP

Each job name within a JOBGROUP is identified by GJOB statement. Job names should be unique. AFTER is positioned after a GJOB statement and it defines the dependency to another job. WHEN defines a condition associated with the dependency. 

In the picture shown above, the jobs are defined in such a way that Z01071A job runs first; Z01071B runs after Z01071A and only WHEN the return code of Z01071A is 0. Likewise, Z01071C job should run after Z01071B and only when the RC of Z01071B is 0. 
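In sketch form, the definition would look something like this (I've written the AFTER/WHEN condition syntax from memory of the IBM examples, so double-check it against the z/OS JCL reference):

//Z01071GP JOBGROUP ERROR=(ABEND),ONERROR=STOP
//Z01071A  GJOB
//Z01071B  GJOB
//DEPB     AFTER NAME=Z01071A,WHEN=(Z01071A.RC=0)
//Z01071C  GJOB
//DEPC     AFTER NAME=Z01071B,WHEN=(Z01071B.RC=0)
//         ENDGROUP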

Like AFTER statement, BEFORE and CONCURRENT can be specified to define the order in which the jobs can run.  

The next step is to add the jobs to the JCL. Yup! You need to have the JOBGROUP definition and the jobs that are part of the JOBGROUP, in the same JCL. 

The complete JCL with JOBGROUP definition and its jobs.

As you can see from the picture above, all the jobs that are defined within a JOBGROUP are added after the ENDGROUP statement. Do you notice a SCHEDULE statement included below each JOB statement? This is used to assign the scheduling attributes; in this case, it assigns the job to our group, Z01071GP. Also note that I've used IDCAMS to set the MAXCC to 16 in the second job. This is done deliberately, and you'll shortly witness the reason for it. It's time ⏰ to watch things in action. Let's see the outcome of this JCL after submission.
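Below the ENDGROUP, the jobs themselves follow, each tied to the group with a SCHEDULE statement. A sketch (the first and third jobs just run IEFBR14; the second forces a bad return code with IDCAMS):

//Z01071A  JOB 1
//         SCHEDULE JOBGROUP=Z01071GP
//STEP01   EXEC PGM=IEFBR14
//*
//Z01071B  JOB 1
//         SCHEDULE JOBGROUP=Z01071GP
//STEP01   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  SET MAXCC=16
/*
//*
//Z01071C  JOB 1
//         SCHEDULE JOBGROUP=Z01071GP
//STEP01   EXEC PGM=IEFBR14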

The JOBGROUP and its jobs are submitted.

Tada! 🧞 Just ignore the order in which the messages are displayed. The Z01071A and Z01071B jobs ran (one after another) and completed with MAXCC 0 👍. But what happened to the third job? The Z01071C job was flushed, as the return code from Z01071B was not zero 😑. As per the definition in the JOBGROUP, Z01071C should run only when the Z01071B job issues a return code of 0.

I hope you are now up to speed with JOBGROUP. Hang on and read a little more, as I'm about to unearth another useful feature of JOBGROUP, i.e., the HOLDUNTL parameter.

How did I use JOBGROUP at my site to prevent misses in running the SQL query? 

The requirement was to run the SQL SELECT query 2 times a day throughout that week. I defined a JOBGROUP with 10 jobs in it. Each job had the following steps:
Step 1: a DB2 UNLOAD step with the SQL SELECT query which would write the rows to an output dataset. 
Step 2: IDCAMS step to check for emptiness of the output file. RC=0 if the file is not empty. Else, RC=4. 
Step 3/4: an email step that runs based on the return code from Step 2. If RC=0, an email is sent from the Mainframe to the folks who are on support for that week, notifying them that there are rows to update in Prod.   

All the 10 jobs were scheduled to run at pre-defined times. Let me show you how we can do that. 
The SCHEDULE statement of the first job now has the HOLDUNTL parameter, with which we can specify the date and time until which the job should be held. 

After submitting the JOBGROUP, the status of the first job, Z01071A, shows HOLD in the spool.

If you traverse the columns from left to right, you'll find a column 'HoldUntil' which holds the date and time we passed in the JCL.

The jobs are fired exactly at the specified date and time. 


More about HOLDUNTL parameter and its syntax can be found here.

In my scenario, I added the HOLDUNTL parameter to all the 10 jobs, with the dates and times spanning that whole week. 

Now, the folks who were on support only had to monitor their inbox, and when they received an email notification saying there were rows in the Prod table that needed a fix, they had to run the one-time job in Prod. JCL's JOBGROUP feature was counted on to run the SQL query without a miss 😃.

All of this might sound like automating stuff. True, but only to a certain extent. We still needed manual intervention to run the one-time Prod job with help from another team. 

Something is better than nothing!

Thanks for reading! Should you have any questions or suggestions, please drop them in the Comments section below.


References: IBM Knowledge Center
Screenshot courtesy: Screenshots were taken from MTM2020 system. 


Saturday, January 9, 2021

Writing a REXX Program to create a Valid Luhn Number generator | Master the Mainframe 2020 | Level 3.3: REXX2: Nexxt Level Rexx

Welcome to my blog! This is my first post of this new year, 2021. Happy New Year to one and all πŸ˜„

In this post, we'll look at Level 3.3 of Master the Mainframe 2020. If you haven't registered for MTM2020 yet, then you're missing the fun 😐. Master the Mainframe has been a lot more interesting this year (2020), as we are using Visual Studio Code (for the first time) to establish a connection with z/OS. Hit this link to register.

Level 3.3: REXX2: Nexxt Level Rexx is all about building complex logic and functionality using REXX. The task is to write a REXX exec that acts as a generator which, when executed, will output Luhn-algorithm-compatible 16-digit numbers. The code shouldn't take any parameters, though.



If you do not have prior experience writing REXX execs, do not worry at all. 

  • MTM2020 is the right place to get your hands on REXX. There are a couple of challenges in Level 3 which will improve your REXX skills in the best way possible.
  • Advertisement alert ⚠: I also recommend reading a blog post of mine 😀 which is intended for beginners in REXX. 
  • Jim Barry's REXX tutorial is my personal favourite, so I recommend that as well. 


It's time ⏰ to go further... 

The first step in the challenge instructions will ask you to copy a member named CCVIEW from MTM2020.PUBLIC.SOURCE to your own SOURCE dataset. Let's take a look at the REXX Exec residing in CCVIEW member, section by section.

In a nutshell, this REXX exec reads an input file, MTM2020.PUBLIC.CUST16, line by line over a DO WHILE loop until the end of the file, and writes each line to an output file if the line read is longer than 10 characters. We should be aware of a few commands in REXX before venturing into a REXX exec that deals with files (like this one 👆). 

First things first, the dataset(s) that you're going to use in a REXX exec should be allocated to the address space that the REXX is running under. This is usually done using the ALLOC command, as shown in lines 11 and 12. The ALLOC command can be used by a REXX program to dynamically allocate the necessary datasets. 

Defining the ALLOC command for a dataset in a REXX exec is very similar to defining a dataset in a JCL DD statement for I/O. 

Let's go thru the command given in line 11. 

"ALLOC FI(indd) DA('MTM2020.PUBLIC.CUST16') SHR REUSE" 

FI refers to the name that will be used by other commands in this REXX exec, like EXECIO, to refer to this file (kinda shorthand 😉). DA refers to the real dataset name. The disposition SHR indicates that the file is intended for reading. It is imperative to note that any dataset allocated within a REXX program using the ALLOC command should be released back to the system using the FREE command after the file has been used. More about ALLOC can be found 👉 here.

After the dataset is allocated, we can read the file by the EXECIO command using the DISKR parameter. Let's look at line 23. 

"EXECIO 0 DISKR indd (OPEN"

The EXECIO command is used for processing files. 0 along with DISKR indicates that 0 records should be read from the file referenced by indd DD name.
R in the DISKR is for Read. Similarly, W in the DISKW is for write. X in the DISKX will issue an error because there is no such thing as DISKX πŸ˜…. Just DISKW and DISKR.
(OPEN indicates that the file should be opened for future use. Another example of EXECIO command can be found in line 31. 

"EXECIO 1 DISKR indd"

This EXECIO command resides within a DO loop ➿. In each iteration, this EXECIO reads 1 record from indd. The records read are written to the stack on a first-in, first-out basis. The records are then available by using the PULL or PARSE PULL instructions. 

Lines 30-45 are already explained in the challenge instructions. Shamelessly copying it here πŸ˜’. 
There are three nested DO statements here. The outermost loop is just a check to make sure there are more lines to read, the middle loop iterates through those lines, and the innermost loop checks if the length of the line is greater than 10 characters before writing it to the output file.
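In sketch form, that loop goes something like this (not the exact CCVIEW code):

"EXECIO 1 DISKR indd"              /* read the first record          */
do while rc = 0                    /* rc stays 0 until end of file   */
  parse pull line                  /* take the record off the stack  */
  if length(line) > 10 then do     /* keep only the longer lines     */
    push line                      /* put it back on the stack       */
    "EXECIO 1 DISKW outdd"         /* write one record from the stack*/
  end
  "EXECIO 1 DISKR indd"            /* read the next record           */
end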

Make note of line 41 which is commented out. 

call INSPECT

The CALL instruction calls 📞 an internal routine named INSPECT. Routines are made up of a sequence of instructions that can receive data, process it, and return a value. The INSPECT label with a colon can be found at line 80. When REXX executes the CALL instruction at line 41, control passes to line 80. The instructions under the INSPECT routine are then executed; in this case, just a SAY instruction, which outputs a message to the user. The RETURN statement at line 82 returns control back to the CALL instruction.

Both the input and output files are closed in lines 46 and 57 respectively. FINIS in the command indicates that the dataset should be closed after use. If FINIS is omitted, the dataset will remain open for future use. 

The datasets are freed by the FREE command in lines 76 and 77 🆓. 

When you run this REXX Exec after replacing ZXXXXX with your user ID in line 12, you will notice that the output file looks the same as the input file. You now have a code that runs and it's time to build upon that foundation to make it do something else. 

What's the ask?

Your first task is to add logic to this program that checks whether the credit card number being read satisfies the Luhn algorithm. You would probably code the validation logic inside the INSPECT routine. 

Once you have the logic for validating input as a Luhn number, the final task of this challenge is to build a generator that outputs Luhn-algorithm-compatible 16-digit numbers. The final task should not read any input files or parameters. When the code is executed, it should write 500 unique numbers, one on each line, to a PDS member (ZXXXXX.OUTPUT(CUST16)). You have to use the ALLOC command to allocate the PDS member. 

Reading the challenge instructions carefully plays a vital role in finishing this challenge successfully. You have got several hints 💡 in there; let me list them out here. 

  • REXX is a dynamically typed language. This means you are able to store several values of different types in a single variable during your code execution and no errors will occur. 
    • For example, consider the following 2 lines.

       odd_digits = 0
       odd_digits = odd_digits + substr(cc_digits,5,1)

    • In the second line, REXX actually understands the data by its usage. It automatically converts the output of the substr() function to a number in order to perform the arithmetic operation. 
  • Step 9 in the challenge instructions lists some built-in functions which you may find useful to accomplish the task.
    • SUBSTR – Substring – returns just the characters at a specific location within a string. There’s an example of this in action at line 40. 
    • LENGTH – Returns the total length, in characters, of a string. 
    • MATH OPERATORS – In particular, the // symbol, which returns the remainder after dividing by a number.
    • RANDOM – is a built-in function in REXX and it is used to generate a random non-negative whole number between the min and max range that are provided as arguments. 


What was my approach to finish this challenge πŸ€”? 

I came across the credit card anatomy, which goes like below:

Perceive lines 4 thru 16 as a credit card 💳. The first digit is the Major Industry Identifier (MII), which tells you what sort of institution issued the card. Lines 21 thru 27 show the list of institutions.

The first six digits are the Issuer Identification Number (IIN). These can be used to look up where the card originated. Lines 31 thru 35 might ring some bells 🔔.

The 7th digit to the second-to-last digit is the customer account number. Most companies use just 9 digits for the account number, thereby making the credit card number 16 digits. But it's possible to use up to 12 digits for the account number. 

The last digit of a credit card is the check digit or checksum. 

With this info in hand, I generated the random credit card number in the following way: 

/* Major Industry Identifier */                               
random1 = random(1,9)                                         
                                                              
/* random1 to random3 together generates BIN */               
random2 = random(111,999)                                     
random3 = random(10,99)                                       
                                                              
/* random4 to random6 together generates account identifier */
random4 = random(1,9)                                         
random5 = random(1111,9999)                                   
random6 = random(1111,9999)                                   
part2 = random1||random2||random3||random4||random5||random6  
cc_digits = '000' || substr(part2,1,15)    

Note that I had generated only 15 digits of the credit card number using the RANDOM built-in function (excluding the first 3 digits, 000). The last digit, the checksum, is calculated using a formula, 

c = (10 − (s mod 10)) mod 10      

where,
c is the checksum,
s is (the sum of the digits in odd positions) + (the sum of each digit in an even position multiplied by 2) 
Note: if the value of a digit in an even position multiplied by 2 is greater than 9, then subtract 9 from it. 
For example, if s works out to 67, then c = (10 − (67 mod 10)) mod 10 = 3. 

After the checksum is calculated, the 16 digits of the credit card number are formed by concatenating the 15 digits generated from the RANDOM function with the checksum digit. 

cc_digits = strip(cc_digits||strip(checksum))

strip is a built-in function in REXX which removes leading and trailing spaces.
|| is the operator used for concatenating 2 strings. 

I had created a separate routine to calculate the checksum. Once the checksum is calculated, the 16-digit credit card number is passed on to another routine, which validates whether the number is Luhn-compatible or not. If the number is valid, it is written to a compound variable. 

Compound variables:

Variables are treated as compound variables if the variable name ends with a "." (period). All the 500 Luhn-compatible numbers are written to a compound variable named 'out.'. I used DO loops in combination with compound variables to write the numbers. After writing, if you refer to out.1, it will fetch you the first Luhn-compatible number. Likewise, out.499 will fetch you the 499th Luhn-compatible number, and out.0 will give you the count of the total number of records present in the compound variable 'out.'. 
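A quick sketch of the idea (here, out.0 is maintained by our own code):

out.0 = 0
do i = 1 to 500
  /* ...generate a Luhn-compatible number into cc_digits... */
  out.i = cc_digits           /* store the i-th number   */
  out.0 = i                   /* keep the count current  */
end
say out.1                     /* the first number        */
say out.0                     /* the total count, 500    */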

After storing the 500 Luhn-compatible numbers in the compound variable, I used the EXECIO command to write the compound variable to the output file. 

"EXECIO * DISKW outdd (STEM out."

The * and DISKW indicate that all the records from the STEM (i.e., the compound variable) OUT. should be written to the output file outdd.

Some useful tips:

  • Try this website to validate the 500 numbers you generate out of your code, to be Luhn-compatible or not. 
  • I recommend using the Vista TN3270 terminal for this challenge to log on to the MTM system (IP: 192.86.32.153 PORT: 623) rather than VS Code. Coding and running your REXX exec will be a lot easier in a TN3270 terminal. 
  • Coding TRACE I in the 2nd line of your REXX exec and executing the code will put you in debugging mode. This might help you in understanding the program's flow if things are not working out. More about TRACE can be found here.
  • Bonus hint: While using DO loops to traverse the even and odd positions of the 19-digit credit card number, you may have to step by 2 in each iteration. The syntax in such cases is as follows:

         do i = 5 by 2 to 17
            (......)
         end

         do j = 4 by 2 to 18
            (......)
         end

About sharing my solution.. 

Nope😐, I respect the individuality and the fun element in coding these challenges and have opted not to share the complete code. Instead, I believe I've done my best in explaining the various jargon used in REXX which, when understood, might be of help to you in finishing this challenge. If you still need clarity on any topics, please let me know. 

Hope this helps. Should you have any questions, please post it in the comments section below. Thx.