SAX-PARSE taking too long (2 Minutes for a small xml file)

Posted by pablo.vivanco on 15-Nov-2017 13:15

Hello, I'm using OE 10.2B08. When I try to parse an XML file it takes about 2 minutes; parsing the same XML file in OE 11.6 takes only 1 second.

DEFINE VARIABLE hSax AS HANDLE NO-UNDO.

CREATE SAX-READER hSax.
hSax:HANDLER = THIS-PROCEDURE.
hSax:SET-INPUT-SOURCE("file","c:\temp\LAN7008173R5_11648_4120=NOMINA-4120.xml").
hSax:SAX-PARSE().   /******* <--- right here takes 2 minutes!  ********/

Is this a known issue in XML parsing that has been resolved by a hotfix for 10.2B08? Here's the XML I'm having the issue with:

[Attachment: LAN7008173R5_11648_4120=NOMINA-4120.xml]

Any Help is appreciated, thanx in advance!

All Replies

Posted by Brian K. Maher on 15-Nov-2017 13:24

You should look at what the callbacks you have defined are doing.  That’s going to tell you where the real issue is.

Posted by pablo.vivanco on 15-Nov-2017 13:33

Thank you, but there's no definition for any of the callbacks; that's the strange thing. The complete code for the test is just the 5 lines I posted.

Posted by Stefan Drissen on 15-Nov-2017 15:50

I can reproduce the slowness with 10.2B08. If you validate your XML file with, for example, Notepad++ (using the XML plugin), you will see that it also takes a /significant/ amount of time. I think some of the namespace references are timing out or are just very, very slow.

If you pull apart the references in your XML file, you will notice that:

http://www.sat.gob.mx/nomina12

is timing out.

If you disable validation:

hSax:VALIDATION-ENABLED = FALSE.

then with 10.2B I also get a millisecond response.

In both 11.7.1 and 10.2B08 the default for VALIDATION-ENABLED is TRUE, so it seems something has been optimized in the newer versions. /Possibly/, since I am also getting authorization errors on the first schemas, 11 isn't even getting as far as the namespaces that time out.
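
For anyone wanting to check this on their own system, here is a minimal sketch that combines the original test with the validation switch and times the parse. The iStart variable and the ETIME-based timing are my own additions for illustration, not part of the original test; the file path is the one from the first post.

```abl
DEFINE VARIABLE hSax   AS HANDLE  NO-UNDO.
DEFINE VARIABLE iStart AS INTEGER NO-UNDO. /* hypothetical timing variable */

CREATE SAX-READER hSax.
hSax:HANDLER = THIS-PROCEDURE.

/* Turn validation off before parsing, as suggested above. */
hSax:VALIDATION-ENABLED = FALSE.

hSax:SET-INPUT-SOURCE("file","c:\temp\LAN7008173R5_11648_4120=NOMINA-4120.xml").

iStart = ETIME.  /* milliseconds elapsed so far in this session */
hSax:SAX-PARSE().
MESSAGE SUBSTITUTE("SAX-PARSE took &1 ms", ETIME - iStart)
    VIEW-AS ALERT-BOX INFORMATION.

DELETE OBJECT hSax.
```

Running it once with the VALIDATION-ENABLED line and once without should show whether validation accounts for the delay on a given release.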

Posted by Garry Hall on 15-Nov-2017 15:56

Interesting. The underlying Xerces parser was upgraded in 11.1, this might explain the behaviour. I am not aware of anything in the client's XML handling that specifically addressed this.

Posted by pablo.vivanco on 15-Nov-2017 16:31

Thanks so much for your help. I disabled validation and, as you said, it took milliseconds to parse the XML. That works for me for now; I don't need schema validation.

Posted by Peter Judge on 21-Nov-2017 06:40

Can you test how long it takes to load the file from disk? Use COPY-LOB or similar?
 

Posted by Stefan Drissen on 21-Nov-2017 06:47

[quote user="Peter Judge"]

Can you test how long it takes to load the file from disk? Use COPY-LOB or similar?

[/quote]

1 ms - the issue is with how the xml parser is handling timeouts on the namespace references.
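
A rough sketch of how such a disk-load check could be done, assuming the same file path as in the first post. The variable names mXml and iStart are made up for illustration, and this is not necessarily the exact test that produced the 1 ms figure.

```abl
DEFINE VARIABLE mXml   AS MEMPTR  NO-UNDO. /* hypothetical buffer for the file */
DEFINE VARIABLE iStart AS INTEGER NO-UNDO. /* hypothetical timing variable */

iStart = ETIME.

/* Read the whole file into memory without involving the XML parser. */
COPY-LOB FROM FILE "c:\temp\LAN7008173R5_11648_4120=NOMINA-4120.xml" TO mXml.

MESSAGE SUBSTITUTE("Loaded &1 bytes in &2 ms", GET-SIZE(mXml), ETIME - iStart)
    VIEW-AS ALERT-BOX INFORMATION.

SET-SIZE(mXml) = 0. /* release the memory */
```

If this returns in a few milliseconds while SAX-PARSE takes minutes, disk I/O can be ruled out.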

Posted by Peter Judge on 21-Nov-2017 06:58

Thanks … I saw your reply about the validator after sending my reply so “Peter Judge Would Like To Recall This Message”  <cough> .
 
 

This thread is closed