XML Parsing and Generation in COBOL

Sunday, April 14, 2013

Earlier distributed systems suffered from interoperability issues because each vendor implemented its own on-wire format for distributed object messaging. With XML as an on-wire standard, the two camps of Java/J2EE and .NET/C# could speak to each other. Extensible Markup Language (XML) is an extensible, portable, and structured text format. XML is playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere.

Need for XML in COBOL:

With the latest advances in mainframe SOA technology, the mainframe is rapidly evolving into an industry-standard server, capable of publishing legacy data and applications as Web services, as well as consuming distributed Web services that reside outside the mainframe. Web services are platform-independent interfaces that allow communication with other applications using standards-based Internet technologies such as XML. They give organizations an opportunity to reduce the cost and complexity of application integration inside the firewall and create new possibilities for legacy applications to participate in e-business.

CICS Transaction Server includes facilities that allow third-party vendors to create adapters that can immediately enable legacy applications as web services. These facilities also provide additional benefits over gateways, such as improved performance and increased stability compared with their screen-scraping counterparts. By using the same industry-standard technologies as web services, some adapters make it possible for applications to transparently invoke CICS transactions within a web services architecture and receive the resulting data as well-formed XML. For organizations that want to retain the value of their CICS applications, the combination of XML-enabling adapters and web services offers a practical and powerful integration solution.

Web services are not a trend, but an industry-wide movement that can provide a long-term solution for companies that want to integrate legacy applications and data with new e-business processes. In the end, companies need to assess the value of the data contained in their legacy applications.

XML tag handling in COBOL:

IBM offers tools that let enterprise developers adapt existing COBOL business applications so that they can efficiently convert XML messages into native COBOL data and transform COBOL data into XML messages. These tools use the new high-performance XML parsing capabilities and XML GENERATE statement of the IBM Enterprise COBOL compiler as well as existing COBOL language constructs to achieve the conversion tasks.

Before you can parse an XML document with an XML PARSE statement, you must make the document available to your program. Common methods of acquiring the document are by retrieval from a WebSphere MQ message, a CICS transient queue or communication area, or an IMS message processing queue.

If the XML document that you want to parse is held in a file, use ordinary COBOL facilities to place the document into a data item in your program:
  • A FILE-CONTROL entry to define the file to your program
  • An OPEN statement to open the file
  • READ statements to read all the records from the file into a data item (either an elementary item of category alphanumeric or national, or an alphanumeric or national group) that is defined in the WORKING-STORAGE SECTION or LOCAL-STORAGE SECTION
  • Optionally the STRING statement to string all of the separate records together into one continuous stream, to remove extraneous blanks, and to handle variable-length records
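
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the DD name XMLIN, the 80-byte fixed-length records, the 8000-byte buffer, and every data name (XML-FILE, XML-REC, XML-DOC, XML-DOC-PTR, WS-FILE-STATUS) are invented for the example and do not come from any IBM sample.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLLOAD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    XMLIN is an assumed DD name for the input file.
           SELECT XML-FILE ASSIGN TO XMLIN
               FILE STATUS IS WS-FILE-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  XML-FILE.
       01  XML-REC              PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-FILE-STATUS       PIC XX.
       01  XML-DOC              PIC X(8000) VALUE SPACES.
       01  XML-DOC-PTR          PIC 9(5) COMP VALUE 1.
       PROCEDURE DIVISION.
           OPEN INPUT XML-FILE
           IF WS-FILE-STATUS NOT = '00'
               DISPLAY 'OPEN failed, status ' WS-FILE-STATUS
               STOP RUN
           END-IF
           READ XML-FILE
           PERFORM UNTIL WS-FILE-STATUS NOT = '00'
      *        Append the record at the position in XML-DOC-PTR.
               STRING XML-REC DELIMITED BY SIZE
                   INTO XML-DOC
                   WITH POINTER XML-DOC-PTR
                   ON OVERFLOW
                       DISPLAY 'XML-DOC is too small'
                       STOP RUN
               END-STRING
               READ XML-FILE
           END-PERFORM
           CLOSE XML-FILE
      *    XML-DOC-PTR - 1 is the number of bytes loaded.
           IF XML-DOC-PTR > 1
               DISPLAY XML-DOC(1:XML-DOC-PTR - 1)
           END-IF
           GOBACK.
```

Because DELIMITED BY SIZE copies each record whole, trailing blanks in the records stay in the buffer. The loaded portion, XML-DOC(1:XML-DOC-PTR - 1), is what you would then name in the XML PARSE statement.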

XML parsing flow overview

Use these COBOL facilities to process XML input:
  • XML PARSE statement to begin XML parsing and to identify the document and your processing procedure
  • Processing procedure to control the parsing: receive and process the XML events and associated document fragments, and optionally handle exceptions
  • Special registers to receive and pass information:
         -  XML-CODE to determine the status of XML parsing
         -  XML-EVENT to receive the name of each XML event
         -  XML-TEXT to receive XML document fragments from an alphanumeric document
         -  XML-NTEXT to receive XML document fragments from a national document
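
As a minimal sketch of how these pieces fit together, the program below parses a small in-storage document and simply displays each event with its text. The program name, the XML-DOC item, and the XML-HANDLER paragraph name are assumptions made for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLPARS1.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical document; a real one would come from a file,
      * queue, or communication area.
       01  XML-DOC   PIC X(36)
               VALUE '<Short-Msg>Hello, World!</Short-Msg>'.
       PROCEDURE DIVISION.
       MAINLINE.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE XML-HANDLER
               ON EXCEPTION
                   DISPLAY 'XML parse error ' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY 'XML document parsed.'
           END-XML
           GOBACK.
      * The parser performs this paragraph once per XML event.
       XML-HANDLER.
           DISPLAY XML-EVENT ' [' XML-TEXT ']'.
```

Running a trace like this against a sample document is a quick way to see which events your compiler level raises before you write the real processing logic.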

Writing procedures to process XML

In your processing procedure, code statements to handle XML events.

For each event that the parser encounters, it passes information to your processing procedure in several special registers, as shown in the following table. Use these registers to populate the data structures and to control the processing.

When used in nested programs, these special registers are implicitly defined as GLOBAL in the outermost program.

| Special register | Content | Implicit definition and usage |
|---|---|---|
| XML-EVENT (1) | The name of the XML event | |
| XML-CODE (2) | An exception code or zero for each XML event | |
| XML-TEXT (1) | Text (corresponding to the event that the parser encountered) from the XML document if you specify an alphanumeric item for the XML PARSE identifier | Variable-length elementary category alphanumeric item; size limit of 134,217,727 bytes |
| XML-NTEXT (1) | Text (corresponding to the event that the parser encountered) from the XML document if you specify a national item for the XML PARSE identifier | Variable-length elementary category national item; size limit of 134,217,727 bytes |

Notes:
1.   You cannot use this special register as a receiving data item.
2.   The XML GENERATE statement also uses XML-CODE. Therefore, if you code an XML GENERATE statement in the processing procedure, save the value of XML-CODE before the XML GENERATE statement and restore the saved value after the XML GENERATE statement.
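
The save-and-restore advice for XML-CODE might look like the sketch below. Everything except the special registers is an assumption for illustration: the document, the AUDIT-REC structure, and the names SAVED-XML-CODE, AUDIT-XML, and AUDIT-LEN are invented here.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLSAVE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * All data names here are assumed for illustration.
       01  XML-DOC          PIC X(13) VALUE '<Msg>Hi</Msg>'.
       01  SAVED-XML-CODE   PIC S9(9) BINARY.
       01  AUDIT-REC.
           05  EVENT-NAME   PIC X(30).
       01  AUDIT-XML        PIC X(200).
       01  AUDIT-LEN        PIC 9(4) COMP VALUE 0.
       PROCEDURE DIVISION.
       MAINLINE.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE XML-HANDLER
           END-XML
           GOBACK.
       XML-HANDLER.
           MOVE XML-EVENT TO EVENT-NAME
      *    XML GENERATE sets XML-CODE, so save the parser's value.
           MOVE XML-CODE TO SAVED-XML-CODE
           XML GENERATE AUDIT-XML FROM AUDIT-REC
               COUNT IN AUDIT-LEN
               NOT ON EXCEPTION
                   DISPLAY AUDIT-XML(1:AUDIT-LEN)
           END-XML
      *    Restore the saved value before returning to the parser.
           MOVE SAVED-XML-CODE TO XML-CODE.
```

Without the restore, a failed XML GENERATE would leave its own nonzero code in XML-CODE, and the parser would read that value when the processing procedure returns.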

Restriction: A processing procedure must not directly execute an XML PARSE statement. However, if a processing procedure passes control to a method or outermost program by using an INVOKE or CALL statement, the target method or program can execute the same or a different XML PARSE statement. You can also execute the same XML statement or different XML statements simultaneously from a program that is running on multiple threads.

The compiler inserts a return mechanism after the last statement in each processing procedure. You can code a STOP RUN statement in a processing procedure to end the run unit. However, an EXIT PROGRAM statement (when a CALL statement is active) or a GOBACK statement does not return control to the parser. Using either of these statements in a processing procedure results in a severe error.

Terminating XML parsing

You can terminate parsing deliberately by setting XML-CODE to -1 in your processing procedure before returning to the parser from any normal XML event (that is, not an EXCEPTION event). You can use this technique when you have seen enough of the document or have detected some irregularity in the document that precludes further meaningful processing.

In this case, the parser does not signal any further events although an exception condition exists. Therefore, control returns to the ON EXCEPTION phrase if specified. In the imperative statement of the ON EXCEPTION phrase, you can test whether XML-CODE is -1, which indicates that you terminated parsing deliberately. If you do not specify an ON EXCEPTION phrase, control returns to the end of the XML PARSE statement.

You can also terminate parsing after any XML exception event by returning to the parser without changing XML-CODE. The result is similar to the result of deliberate termination except that the parser returns to the XML PARSE statement with XML-CODE containing the exception number.
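
A deliberate termination could be coded along these lines. The document content and the decision to stop after the Head element are assumptions chosen only to show the technique.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLSTOP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical document and element names.
       01  XML-DOC   PIC X(39)
               VALUE '<Doc><Head>h</Head><Body>b</Body></Doc>'.
       PROCEDURE DIVISION.
       MAINLINE.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE XML-HANDLER
               ON EXCEPTION
                   IF XML-CODE = -1
                       DISPLAY 'Parsing stopped deliberately.'
                   ELSE
                       DISPLAY 'XML parse error ' XML-CODE
                   END-IF
               NOT ON EXCEPTION
                   DISPLAY 'Whole document parsed.'
           END-XML
           GOBACK.
       XML-HANDLER.
      *    Stop once the Head element is complete; the rest of
      *    the document is not needed in this sketch.
           IF XML-EVENT = 'END-OF-ELEMENT' AND XML-TEXT = 'Head'
               MOVE -1 TO XML-CODE
           END-IF.
```

The handler changes XML-CODE only on a normal event, so a genuine exception still reaches the ON EXCEPTION phrase with its own exception number.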

Example: XML tag is:

<Short-Msg>Hello, World!</Short-Msg>

The above information needs to be built into a COBOL structure. The working-storage Greet copybook will be as follows (SHORT-MSG is sized to hold the 13-character message):

   02 NAME PIC X(20).
   02 PHONE PIC 9(12).
   02 SHORT-MSG PIC X(20).


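Use the code below in the processing procedure of the program. This fragment assumes that the current event is CONTENT-CHARACTERS and that the element name from the preceding START-OF-ELEMENT event was saved in WS-CURRENT-TAG (an assumed name).

         EVALUATE WS-CURRENT-TAG
            WHEN 'Name'
               MOVE XML-TEXT TO NAME
            WHEN 'Phone'
               COMPUTE PHONE = FUNCTION NUMVAL(XML-TEXT)
            WHEN 'Short-Msg'
               MOVE XML-TEXT TO SHORT-MSG
            WHEN OTHER
               CONTINUE
         END-EVALUATE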
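
For context, a complete program around that kind of tag handling might look like the sketch below. It is a reading of how the pieces could fit, not a definitive listing: the 01-level name GREET, the WS-CURRENT-TAG item, the paragraph names, and the use of FUNCTION NUMVAL for the phone number are all assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLGREET.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  XML-DOC          PIC X(36)
               VALUE '<Short-Msg>Hello, World!</Short-Msg>'.
      * WS-CURRENT-TAG and the 01 name GREET are assumed names.
       01  WS-CURRENT-TAG   PIC X(30) VALUE SPACES.
       01  GREET.
           02  NAME         PIC X(20).
           02  PHONE        PIC 9(12).
           02  SHORT-MSG    PIC X(20).
       PROCEDURE DIVISION.
       MAINLINE.
           INITIALIZE GREET
           XML PARSE XML-DOC
               PROCESSING PROCEDURE XML-HANDLER
               ON EXCEPTION
                   DISPLAY 'XML parse error ' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY 'SHORT-MSG = ' SHORT-MSG
           END-XML
           GOBACK.
       XML-HANDLER.
           EVALUATE XML-EVENT
      *        XML-TEXT holds the element name on this event.
               WHEN 'START-OF-ELEMENT'
                   MOVE XML-TEXT TO WS-CURRENT-TAG
      *        Forget the tag so later white space is ignored.
               WHEN 'END-OF-ELEMENT'
                   MOVE SPACES TO WS-CURRENT-TAG
      *        XML-TEXT holds the element content on this event.
               WHEN 'CONTENT-CHARACTERS'
                   EVALUATE WS-CURRENT-TAG
                       WHEN 'Name'
                           MOVE XML-TEXT TO NAME
                       WHEN 'Phone'
                           COMPUTE PHONE = FUNCTION NUMVAL(XML-TEXT)
                       WHEN 'Short-Msg'
                           MOVE XML-TEXT TO SHORT-MSG
                       WHEN OTHER
                           CONTINUE
                   END-EVALUATE
               WHEN OTHER
                   CONTINUE
           END-EVALUATE.
```

The sketch assumes that each element's content arrives in a single CONTENT-CHARACTERS event. The parser can split content across several events, in which case the pieces would have to be concatenated instead of moved.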

Producing XML output

You can produce XML output from a COBOL program by using the XML GENERATE statement. In the XML GENERATE statement, you can also identify a field to receive a count of the number of characters of XML output generated, and a statement to receive control if an exception occurs.

To produce XML output, use:

  • The XML GENERATE statement to identify the source and target data items, count field, and ON EXCEPTION statement
  • The special register XML-CODE to determine the status of XML generation

After you transform COBOL data items to XML, you can use the resulting XML output in various ways, such as deploying it in a Web service, passing it as a message to WebSphere MQ, or transmitting it for subsequent conversion to a CICS communication area.

Generating XML output

To transform COBOL data to XML, use the XML GENERATE statement.

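      XML GENERATE XML-OUTPUT FROM SOURCE-REC
         COUNT IN XML-CHAR-COUNT
         ON EXCEPTION
            DISPLAY 'XML generation error ' XML-CODE
            STOP RUN
         NOT ON EXCEPTION
            DISPLAY 'XML document was successfully generated.'
      END-XML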

In the XML GENERATE statement, you first identify the data item (XML-OUTPUT in the example above) that is to receive the XML output. Define the data item to be large enough to contain the generated XML output, typically five to eight times the size of the COBOL source data depending on the length of its data-name or data-names.

In the DATA DIVISION, you can declare the receiving identifier as alphanumeric (either an alphanumeric group item or an elementary item of category alphanumeric) or as national (either a national group item or an elementary item of category national).

The receiving identifier must be national if the CODEPAGE compiler option specifies a code page that includes DBCS characters or the XML output will contain any data from the COBOL source record that has any of the following characteristics:
  • Is of class national or class DBCS
  • Has a DBCS name (that is, is a data item whose name contains DBCS characters)
  • Is an alphanumeric item that contains DBCS characters

Next you identify the source data item that is to be transformed to XML format (SOURCE-REC in the example). The source data item can be an alphanumeric group item, national group item, or elementary data item of class alphanumeric or national. Do not specify the RENAMES clause in the data description of that data item.

If the source data item is an alphanumeric group item or a national group item, the source data item is processed as a group item, not as an elementary item. Any groups that are subordinate to the source data item are also processed as group items.

Some COBOL data items are not transformed to XML, but are ignored. Subordinate data items of an alphanumeric group item or national group item that you transform to XML are ignored if they:
  • Specify the REDEFINES clause, or are subordinate to such a redefining item
  • Specify the RENAMES clause
These items in the source data item are also ignored when you generate XML: 
  • Elementary FILLER (or unnamed) data items
  • Slack bytes inserted for SYNCHRONIZED data items

There must be at least one elementary data item that is not ignored when you generate XML. For the data items that are not ignored, ensure that the identifier that you transform to XML satisfies these conditions when you declare it in the DATA DIVISION:
  • Each elementary data item is either an index data item or belongs to one of these classes:
         -  Alphabetic
         -  Alphanumeric
         -  DBCS
         -  Numeric
         -  National
    That is, no elementary data item is described with the USAGE POINTER, USAGE FUNCTION-POINTER, USAGE PROCEDURE-POINTER, or USAGE OBJECT REFERENCE phrase.
  • Each data-name other than FILLER is unique within the immediately containing group, if any.
  • Any DBCS data-names, when converted to Unicode, are legal as names in the XML specification, version 1.0.
  • The data item or items do not specify the DATE FORMAT clause, or the DATEPROC compiler option is not in effect.

An XML declaration is not generated. No white space (for example, new lines or indentation) is inserted to make the generated XML more readable.

Optionally, you can code the COUNT IN phrase to obtain the number of XML character positions that are filled during generation of the XML output. Declare the count field as an integer data item that does not have the symbol P in its PICTURE string. You can use the count field and reference modification to obtain only that portion of the receiving data item that contains the generated XML output. For example, XML-OUTPUT(1:XML-CHAR-COUNT) references the first XML-CHAR-COUNT character positions of XML-OUTPUT.

In addition, you can specify either or both of the following phrases to receive control after generation of the XML document:
  • ON EXCEPTION, to receive control if an error occurs during XML generation
  • NOT ON EXCEPTION, to receive control if no error occurs

You can end the XML GENERATE statement with the explicit scope terminator END-XML. Code END-XML to nest an XML GENERATE statement that has the ON EXCEPTION or NOT ON EXCEPTION phrase in a conditional statement.

XML generation continues until either the COBOL source record has been transformed to XML or an error occurs. If an error occurs, the results are as follows:
  • Special register XML-CODE contains a nonzero exception code.
  • Control is passed to the ON EXCEPTION phrase, if specified, otherwise to the end of the XML GENERATE statement.

If no error occurs during XML generation, special register XML-CODE contains zero, and control is passed to the NOT ON EXCEPTION phrase if specified or to the end of the XML GENERATE statement otherwise.

Enhancing XML output

It might happen that the information that you want to express in XML format already exists in a group item in the DATA DIVISION, but you are unable to use that item directly to generate an XML document because of one or more factors.
For example:
  • In addition to the required data, the item has subordinate data items that contain values that are irrelevant to the XML output document.
  • The names of the required data items are unsuitable for external presentation, and are possibly meaningful only to programmers.
  • The definition of the data is not of the required data type. Perhaps only the redefinitions (which are ignored by the XML GENERATE statement) have the appropriate format.
  • The required data items are nested too deeply within irrelevant subordinate groups. The XML output should be "flattened" rather than hierarchical as it would be by default.
  • The required data items are broken up into too many components, and should be output as the content of the containing group.
  • The group item contains the required information but in the wrong order.

There are various ways that you can deal with such situations. One possible technique is to define a new data item that has the appropriate characteristics, and move the required data to the appropriate fields of this new data item. However, this approach is somewhat laborious and requires careful maintenance to keep the original and new data items synchronized.

An alternative approach that has some advantages is to provide a redefinition of the original group data item, and to generate the XML output from that redefinition. To do so, start from the original set of data descriptions, and make these changes:
  • Exclude elementary data items from the generated XML either by renaming them to FILLER or by deleting their names.
  • Provide more meaningful and appropriate names for the selected elementary items and for the group items that contain them.
  • Remove unneeded intermediate group items to flatten the hierarchy. 
  • Specify different data types to obtain the desired trimming behavior.
  • Choose a different order for the output by using a sequence of XML GENERATE statements.

The safest way to accomplish these changes is to use another copy of the original declarations accompanied by one or more REPLACE compiler-directing statements.
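
The redefinition approach could be sketched as below. The record layout and all names (CUST-REC, Customer, CustomerName, Phone, and so on) are hypothetical, and the redefinition is written by hand here instead of being produced with REPLACE, purely to keep the example short.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLREDEF.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical original record with terse internal names.
       01  CUST-REC.
           05  CR-NM        PIC X(20) VALUE 'Ada'.
           05  CR-INTERNAL  PIC X(8)  VALUE 'A1B2C3D4'.
           05  CR-PH        PIC 9(12) VALUE 5551234.
      * Redefinition used only as the XML GENERATE source.
       01  Customer REDEFINES CUST-REC.
           05  CustomerName PIC X(20).
      *    FILLER keeps CR-INTERNAL out of the generated XML.
           05  FILLER       PIC X(8).
           05  Phone        PIC 9(12).
       01  XML-OUTPUT       PIC X(300).
       01  XML-CHAR-COUNT   PIC 9(4) COMP.
       PROCEDURE DIVISION.
           XML GENERATE XML-OUTPUT FROM Customer
               COUNT IN XML-CHAR-COUNT
               ON EXCEPTION
                   DISPLAY 'XML generation error ' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY XML-OUTPUT(1:XML-CHAR-COUNT)
           END-XML
           GOBACK.
```

Because both descriptions share the same storage, nothing has to be moved or kept in sync at run time. The maintenance risk is that the two layouts drift apart, which is the problem the REPLACE-based copy is meant to avoid.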

Handling errors in generating XML output

When an error is detected during generation of XML output, an exception condition exists. You can write code to check the special register XML-CODE, which contains a numeric exception code that indicates the error type.

To handle errors, use either or both of the following phrases of the XML GENERATE statement:
  • ON EXCEPTION
  • COUNT IN

If you code the ON EXCEPTION phrase in the XML GENERATE statement, control is transferred to the imperative statement that you specify. You might code an imperative statement, for example, to display the XML-CODE value. If you do not code an ON EXCEPTION phrase, control is transferred to the end of the XML GENERATE statement.

When an error occurs, one problem might be that the data item that receives the XML output is not large enough. In that case, the XML output is not complete, and special register XML-CODE contains error code 400.

You can examine the generated XML output by doing these steps:
  1. Code the COUNT IN phrase in the XML GENERATE statement. The count field that you specify holds a count of the XML character positions that are filled during XML generation. If you define the XML output as national, the count is in national character positions (UTF-16 character encoding units); otherwise the count is in bytes.
  2. Use the count field with reference modification to refer to the substring of the receiving data item that contains the generated XML output.

For example, if XML-OUTPUT is the data item that receives the XML output, and XML-CHAR-COUNT is the count field, then XML-OUTPUT(1:XML-CHAR-COUNT) references the XML output.
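
The error path described above might be exercised with a sketch like this one, in which the receiving item is deliberately made too small. The CUST-NAME field and its value are assumptions, and the bounds check on the count is a defensive addition of this example, not something the compiler requires.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLERR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * CUST-NAME and its value are hypothetical sample data.
       01  SOURCE-REC.
           05  CUST-NAME      PIC X(20) VALUE 'Ada Lovelace'.
      * Deliberately too small, to force a generation error.
       01  XML-OUTPUT         PIC X(30) VALUE SPACES.
       01  XML-CHAR-COUNT     PIC 9(4) COMP VALUE 0.
       PROCEDURE DIVISION.
           XML GENERATE XML-OUTPUT FROM SOURCE-REC
               COUNT IN XML-CHAR-COUNT
               ON EXCEPTION
                   DISPLAY 'XML generation error ' XML-CODE
      *            Show whatever was generated before the error.
                   IF XML-CHAR-COUNT > 0 AND
                      XML-CHAR-COUNT <= LENGTH OF XML-OUTPUT
                       DISPLAY 'Partial output: '
                           XML-OUTPUT(1:XML-CHAR-COUNT)
                   END-IF
               NOT ON EXCEPTION
                   DISPLAY XML-OUTPUT(1:XML-CHAR-COUNT)
           END-XML
           GOBACK.
```

Seeing where the partial output stops gives a rough idea of how much larger the receiving item needs to be.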

Example: XML tag is:

<Short-Msg>Hello, World!</Short-Msg>

Generate the above tag by using the COBOL XML GENERATE statement. The working-storage Greet copybook will be:
   02 NAME PIC X(20).
   02 PHONE PIC 9(12).
   02 SHORT-MSG PIC X(20).

01 WS-TAG PIC X(1000).
01 WS-TAG-LENGTH PIC 9(4) COMP.


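The variable names used above should be the same as the tag names, because XML GENERATE derives the element names from the data-names. In the procedure division, code an XML GENERATE statement that names WS-TAG as the receiving item and WS-TAG-LENGTH in the COUNT IN phrase.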
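
One way the procedure division code could look is sketched below. It is an assumption-laden illustration: the 01-level name GREET is invented, the count field is declared as PIC 9(4) COMP, and the data names are written in mixed case on the understanding that Enterprise COBOL takes element names from the spelling in the data description entry, which is worth verifying on your compiler level.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLGREE2.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * The 01 name GREET is assumed; mixed-case data names are
      * used so that the element names match the target tag.
       01  GREET.
           02  Name           PIC X(20).
           02  Phone          PIC 9(12).
           02  Short-Msg      PIC X(20).
       01  WS-TAG             PIC X(1000).
       01  WS-TAG-LENGTH      PIC 9(4) COMP.
       PROCEDURE DIVISION.
           MOVE 'Hello, World!' TO Short-Msg
      *    Only Short-Msg is the source, so one element results.
           XML GENERATE WS-TAG FROM Short-Msg
               COUNT IN WS-TAG-LENGTH
               ON EXCEPTION
                   DISPLAY 'XML generation error ' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY WS-TAG(1:WS-TAG-LENGTH)
           END-XML
           GOBACK.
```

Naming the group GREET as the source instead would wrap all three fields in a GREET element, so the elementary item is used here to produce just the single tag.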


The variable WS-TAG will contain the generated tag, and WS-TAG-LENGTH will hold its length.
