What is DFSORT?
DFSORT is a high-performance sort, merge, copy, and data manipulation utility used on IBM mainframes. It's an incredibly versatile tool that can handle a variety of data processing tasks, including the generation of SQL queries from a dataset.
The Scenario
Imagine you receive an Excel spreadsheet with thousands of rows, each containing the current and target values for specific columns in your DB2 table. Your task is to update the table with these target values. The sheet consists of details about Employees and the task is to update the current salary with the target salary.
Preparing the Dataset
First, convert the Excel spreadsheet into a format suitable for processing in Mainframe, such as a comma-separated values (CSV) file. Once you have the CSV file, you need to upload it to the mainframe and create a dataset (INPUT.DATA
) that DFSORT can process. At the site I work for, I use WS FTP Pro software to download/upload files from/to Mainframe server.
Assuming your CSV data looks like this:
001,John,Smith,Sales,5000,5500
002,Jane,Doe,HR,6000,6500
003,Mike,Johnson,IT,5500,6000
...
Each field is:
- Employee ID: Position 1-3
- First Name: Position 5-10
- Last Name: Position 12-17
- Department: Position 19-24
- Current Salary: Position 26-30
- Target Salary: Position 32-36
Reformatting the CSV file
//REFORMAT EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=INPUT.CSV,DISP=SHR
//SORTOUT DD DSN=INPUT.DATA,DISP=(NEW,CATLG,DELETE),
// SPACE=(CYL,(1,1)),UNIT=SYSDA
//SYSIN DD *
OPTION COPY
INREC PARSE=(%01=(ENDBEFR=C',',FIXLEN=3),
%02=(ENDBEFR=C',',FIXLEN=6),
%03=(ENDBEFR=C',',FIXLEN=7),
%04=(ENDBEFR=C',',FIXLEN=6),
%05=(ENDBEFR=C',',FIXLEN=5),
%06=(FIXLEN=5)),
BUILD=(%01,3X,%02,2X,%03,X,%04,2X,%05,2X,%06)
/*
Explanation
OPTION COPY
: Instructs DFSORT to copy the input records.INREC PARSE
: Defines how to parse the input CSV records:%01=(ENDBEFR=C',',FIXLEN=3)
: Parses the first field (Employee ID) up to the comma and with a fixed length of 3.%02=(ENDBEFR=C',',FIXLEN=6)
: Parses the second field (First Name) up to the comma and with a fixed length of 6.%03=(ENDBEFR=C',',FIXLEN=7)
: Parses the third field (Last Name) up to the comma and fixed length of 6.%04=(ENDBEFR=C',',FIXLEN=6)
: Parses the fourth field (Department) up to the comma and fixed length of 6.%05=(ENDBEFR=C',',FIXLEN=5)
: Parses the fifth field (Current Salary) up to the comma and fixed length of 5.%06=(FIXLEN=5)
: Parses the sixth field (Target Salary) with fixed length of 5.
BUILD
: Rebuilds the records with fixed positions and adds spaces as needed.Generating SQL Queries with DFSORT
INPUT.DATA
), we can proceed to generate the SQL queries: //SORTSTEP EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=INPUT.DATA,DISP=SHR
//SORTOUT DD DSN=OUTPUT.SQL,DISP=(NEW,CATLG,DELETE),
// SPACE=(CYL,(1,1)),UNIT=SYSDA
//SYSIN DD *
OPTION COPY
OUTFIL BUILD=(C'-- UPDATE FOR ',7,6,X,15,7,80:X,/,
C'UPDATE EMPLOYEES ',80:X,/,
C' SET SALARY = ',38,5,80:X,/,
C' WHERE EMP_ID = ',1,3,80:X,/,
C' AND SALARY = ',31,5,80:X,/,
C';',80:X,/,
80:X)
/*
Explanation
OPTION COPY
: Instructs DFSORT to copy the input records.OUTFIL
: Defines the output file, the default being SORTOUT.BUILD
: Specifies the format of the output records. Here's a breakdown:C'-- UPDATE FOR '
: Adds a comment with the employee's name.7,6,X,15,7,80:X,/,
: Extracts the first name and last name. Adds a line break (/) after 80th byte.C'UPDATE EMPLOYEES ',80:X,/,
: Static text for the SQL SET statement. Adds a line break (/) after 80th byte so that further statements of the SQL can be written in the next line.C' SET SALARY = ',38,5,80:X,/,
: Static text for the SQL SET statement. Extracts the target salary value from position 38, length 5 and places it after the SET statement. Adds a line break after 80th byte.C' WHERE EMP_ID = ',1,3,80:X,/,
: Static text for the WHERE clause. Extracts the employee ID from position 1, length 3 and places it after the first column in the WHERE clause. Adds a line break after 80th byte.C' AND SALARY = ',31,5,80:X,/,
: Adds a condition for the current salary. Extracts the current salary value from position 31, length 5. As usual, a line break is added after the 80th byte.32,5
: Extracts the target salary value from position 32, length 5.C';',80:X,/,
: Adds a semicolon to end the SQL statement. A line break is added after the 80th byte.- 80:X: Adds an empty line.
The Output
After running the DFSORT step, the OUTPUT.SQL
dataset will contain SQL update statements like this:
-- UPDATE FOR JOHN SMITH
UPDATE EMPLOYEES
SET SALARY = 5500
WHERE EMP_ID = 001
AND SALARY = 5000
;
-- UPDATE FOR JANE DOE
UPDATE EMPLOYEES
SET SALARY = 6500
WHERE EMP_ID = 002
AND SALARY = 6000
;
-- UPDATE FOR MIKE JOHNSON
UPDATE EMPLOYEES
SET SALARY = 6000
WHERE EMP_ID = 003
AND SALARY = 5500
;
Conclusion
By leveraging DFSORT’s powerful data manipulation capabilities, you can automate the generation of SQL queries for bulk updates to your DB2 tables. This method not only saves a significant amount of time but also minimizes errors associated with manual SQL writing. Whether you’re handling hundreds or thousands of updates, DFSORT provides a robust and efficient solution for your data processing needs on the mainframe.
There are other ways to generate bulk SQL queries. In the next blog post, I'll show you how to achieve the same task using REXX.
Hope this helps. Should you have any questions, please leave them in the Comments section below.
No comments:
Post a Comment