Speeding up database inserts

Posted by Gertjan Hendriks on 03-Dec-2014 02:54

Hi all,

We use Corticon Studio and Server version 5.4.1.

We are building a batch process for a monthly run. Because of EDC-limitations we are forced to read data from various sources (SQL Server tables and views, with no associations/ join expressions possible), and do matching within Corticon. This is a less efficient and a little more time-consuming, yet acceptable option. With around 33.000 instances in one specific flow, a matching- and calculation process takes up to around 32 seconds.

The major drawback comes from the data-writing. After walking through the flow with separate rulesheets for reading data, creating non-persistent entity-copies of every read record, and doing calculations on these non-persistent entities, every created CDO is then persisted to the database via the final rulesheet via a persistent entity. With 33.000 records, Hibernate generates 33.000 separate SQL-inserts (one for each CDO), which is very time-consuming. This can take up to 55 minutes, where we only need 32 seconds for calculating and matching.

Is there any way to speed up the database insert process in this situation? Will for instance the High Performance Batch Processor help us with this?

Thank you!

All Replies

Posted by mparish on 03-Dec-2014 03:05

One option to consider is to use a java Service Call Out as the first step in the rule flow which can read the data from the databases and construct the appropriate Corticon objects with associations.
Finally as the last step in the rule flow make a call to another SCO which can do all the database updates in a more efficient manner.
Mike

iPhone

On Dec 3, 2014, at 12:54 AM, "hendrige" <bounce-hendrige@community.progress.com> wrote:

Thread created by hendrige

Hi all,

We use Corticon Studio and Server version 5.4.1.

We are building a batch process for a monthly run. Because of EDC-limitations we are forced to read data from various sources (SQL Server tables and views, with no associations/ join expressions possible), and do matching within Corticon. This is a less efficient and a little more time-consuming, yet acceptable option. With around 33.000 instances in one specific flow, a matching- and calculation process takes up to around 32 seconds.

The major drawback comes from the data-writing. After walking through the flow with separate rulesheets for reading data, creating non-persistent entity-copies of every read record, and doing calculations on these non-persistent entities, every created CDO is then persisted to the database via the final rulesheet via a persistent entity. With 33.000 records, Hibernate generates 33.000 separate SQL-inserts (one for each CDO), which is very time-consuming. This can take up to 55 minutes, where we only need 32 seconds for calculating and matching.

Is there any way to speed up the database insert process in this situation? Will for instance the High Performance Batch Processor help us with this?

Thank you!

Stop receiving emails on this subject.

Flag this post as spam/abuse.

Posted by Gertjan Hendriks on 03-Dec-2014 05:40

Hi Mike,

Thank you for your (quick!) suggestion. I will discuss this option with the IT-department.

Yet a less technical solution (read: No Java-coding) would surely be much preferred, if possible.

[quote user="mparish"]

One option to consider is to use a java Service Call Out as the first step in the rule flow which can read the data from the databases and construct the appropriate Corticon objects with associations.
Finally as the last step in the rule flow make a call to another SCO which can do all the database updates in a more efficient manner.
Mike

iPhone

[/quote]

Posted by Harold-Jan Verlee on 03-Dec-2014 05:52

I'm afraid that the HPBP in this situation will not be an proper solution, as per your use case, within the rules you have to reason over the entire dataset (i.e. clustering). The HPBP seeds records in chunks and only these chunks are processed in internal working memory at a time.

Posted by Chris S. Hogan on 03-Dec-2014 13:06

The scenario is a bit unclear. Are you processing 33,000 records in one single decision service invocation? Or, are you making 33,000 separate decision invocations?
If it is the first scenario, then HPBP is not applicable for the reasons Harold mentioned. If it is the second, then HPBP would be applicable.
 
Christopher S. Hogan
Principal Systems Engineer

Progress

PHONE 646-201-4123
MOBILE 646-243-4282
www.progress.com
Twitter
Facebook
LinkedIn
Google+
 
 
[collapse]
From: Harold-Jan Verlee [mailto:bounce-hverlee@community.progress.com]
Sent: Wednesday, December 03, 2014 6:53 AM
To: TU.Corticon@community.progress.com
Subject: RE: [Technical Users - Corticon] Speeding up database inserts
 
Reply by Harold-Jan Verlee

I'm afraid that the HPBP in this situation will not be an proper solution, as per your use case, within the rules you have to reason over the entire dataset (i.e. clustering). The HPBP seeds records in chunks and only these chunks are processed in internal working memory at a time.

Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse]

Posted by Gertjan Hendriks on 04-Dec-2014 01:45

Hi Chris,

In my case, I am working with 33.000 records being processed in one single decision service invocation. That means that the High Performance Batch Processor (HPBP) is not an option here.

So thus far, a Service Call-out seems to be the only option. Yet I hope there's another, less technical option.

Kind regards,

Gertjan

[quote user="Chris S. Hogan"]

The scenario is a bit unclear. Are you processing 33,000 records in one single decision service invocation? Or, are you making 33,000 separate decision invocations?
If it is the first scenario, then HPBP is not applicable for the reasons Harold mentioned. If it is the second, then HPBP would be applicable.
 
Christopher S. Hogan
Principal Systems Engineer

Progress

PHONE 646-201-4123
MOBILE 646-243-4282
www.progress.com
Twitter
Facebook
LinkedIn
Google+
 
 
[/quote]

Posted by Gertjan Hendriks on 15-Dec-2014 09:08

The story continues.

As no easy solution seems to be at hand at this time, we commenced our search for a solution. Via Support I got pointed towards Progress DataDirect (which is now under consideration) .

Aside of that we are also looking for the use of a Service Call-out (SCO), for the time being. There we found out that EDC uses a custom driverclass ´com.prgs.sqlserver.jdbc.sqlserverdriver´, instead of the JDBC-standard ´com.microsoft.sqlserver.jdbc.sqlserverdriver´. Yet we are unable to register this class from within Corticon Studio (5.4.1), causing an error when trying to run a ruletest.

Main question: Is anyone able to post an example of a Corticon Service Call-out connecting to a (prefereably SQL Server) database connected through EDC? This might help us a lot.

Thank you!

-Gertjan

Posted by mcicel on 15-Dec-2014 09:18

Hi Gertjan,

The JDBC driver used by EDC is the  Progress DataDirect driver.

What error are you getting when you run a ruletest?

Thank you,

Marian

Posted by Gertjan Hendriks on 15-Dec-2014 09:45

Hi Marian,

I just found out about EDC/ DataDirect (thank you Jan!). Thank you :)

The .jar we created is part of the Corticon-project, and we added it to the manifest-file. The error-message we get (and I hope it renders correctly):

com.microsoft.sqlserver.jdbc.SQLServerDriver cannot be found by com.corticon.eclipse.studio.operations.extended.core_5.4.9
Cc|2014-12-15 14:34:21.491|Thread:main|Version: 5.4.1.0 -b6506(5.4.6506)|<username>|INFO|Magi_FilingUnit_SCO|java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver cannot be found by com.corticon.eclipse.studio.operations.extended.core_5.4.9

                at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:501)

                at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:421)

                at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:412)

                at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(DefaultClassLoader.java:107)

                at java.lang.ClassLoader.loadClass(ClassLoader.java:356)

                at java.lang.Class.forName0(Native Method)

                at java.lang.Class.forName(Class.java:186)

[...]

Is this enough information for you?

[quote user="mcicel"]

Hi Gertjan,

The JDBC driver used by EDC is the  Progress DataDirect driver.

What error are you getting when you run a ruletest?

Thank you,

Marian

[/quote]

Posted by Jan Krishnamurthy on 15-Dec-2014 10:40

Hello Gertjan,
 
Corticon is bundled with the DataDirect driver .
 
Please try to include CcThirdPartyJars.jar in the path. The CcThirdPartyJars.jar has the DataDirect Drivers within.
 
This should resolve the com.microsoft.sqlserver.jdbc.SQLServerDriver cannot be found by com.corticon.eclipse.studio.operations.extended.core_5.4.9
 
Thanks,
Jan
 
 
 
[collapse]
From: hendrige [mailto:bounce-hendrige@community.progress.com]
Sent: Monday, December 15, 2014 10:46 AM
To: TU.Corticon@community.progress.com
Subject: RE: [Technical Users - Corticon] Speeding up database inserts
 
Reply by hendrige

Hi Marian,

I just found out about EDC/ DataDirect (thank you Jan!). Thank you :)

The .jar we created is part of the Corticon-project, and we added it to the manifest-file. The error-message we get (and I hope it renders correctly):

com.microsoft.sqlserver.jdbc.SQLServerDriver cannot be found by com.corticon.eclipse.studio.operations.extended.core_5.4.9
Cc|2014-12-15 14:34:21.491|Thread:main|Version: 5.4.1.0 -b6506(5.4.6506)|<username>|INFO|Magi_FilingUnit_SCO|java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver cannot be found by com.corticon.eclipse.studio.operations.extended.core_5.4.9

                at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:501)

                at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:421)

                at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:412)

                at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(DefaultClassLoader.java:107)

                at java.lang.ClassLoader.loadClass(ClassLoader.java:356)

                at java.lang.Class.forName0(Native Method)

                at java.lang.Class.forName(Class.java:186)

[...]

Is this enough information for you?

mcicel
Hi Gertjan,
The JDBC driver used by EDC is the  Progress DataDirect driver.
What error are you getting when you run a ruletest?
Thank you,
Marian
Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse]

Posted by Jan Krishnamurthy on 16-Dec-2014 10:17

Hello Gertjan,
 
In my previous comment ,I meant to say:
To resolve the issue com .microsoft.sqlserver.jdbc.SQLServerDriver cannot be found by com.corticon.eclipse.studio.operations.extended.core_5.4.9
 
Studio must recognize the third party driver jar file.
For this you must create a plugin for the third party driver jar file and add the plugin created for the driver jar in the CORTICON_HOME\Studio\eclipse\plugins directory
 
This way Studio will recognize the jar.
 
See KB article for more details.
http://knowledgebase.progress.com/articles/Article/000033660
 
Thank you,
Jan
 
[collapse]
From: Jan Krishnamurthy [mailto:bounce-jkrishna@community.progress.com]
Sent: Monday, December 15, 2014 11:41 AM
To: TU.Corticon@community.progress.com
Subject: RE: [Technical Users - Corticon] Speeding up database inserts
 
Reply by Jan Krishnamurthy
Hello Gertjan,
 
Corticon is bundled with the DataDirect driver .
 
Please try to include CcThirdPartyJars.jar in the path. The CcThirdPartyJars.jar has the DataDirect Drivers within.
 
This should resolve the com.microsoft.sqlserver.jdbc.SQLServerDriver cannot be found by com.corticon.eclipse.studio.operations.extended.core_5.4.9
 
Thanks,
Jan
 
 
 
[collapse]
From: hendrige [mailto:bounce-hendrige@community.progress.com]
Sent: Monday, December 15, 2014 10:46 AM
To: TU.Corticon@community.progress.com
Subject: RE: [Technical Users - Corticon] Speeding up database inserts
 
Reply by hendrige

Hi Marian,

I just found out about EDC/ DataDirect (thank you Jan!). Thank you :)

The .jar we created is part of the Corticon-project, and we added it to the manifest-file. The error-message we get (and I hope it renders correctly):

com.microsoft.sqlserver.jdbc.SQLServerDriver cannot be found by com.corticon.eclipse.studio.operations.extended.core_5.4.9
Cc|2014-12-15 14:34:21.491|Thread:main|Version: 5.4.1.0 -b6506(5.4.6506)|<username>|INFO|Magi_FilingUnit_SCO|java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver cannot be found by com.corticon.eclipse.studio.operations.extended.core_5.4.9

                at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:501)

                at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:421)

                at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:412)

                at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(DefaultClassLoader.java:107)

                at java.lang.ClassLoader.loadClass(ClassLoader.java:356)

                at java.lang.Class.forName0(Native Method)

                at java.lang.Class.forName(Class.java:186)

[...]

Is this enough information for you?

mcicel
Hi Gertjan,
The JDBC driver used by EDC is the  Progress DataDirect driver.
What error are you getting when you run a ruletest?
Thank you,
Marian
Stop receiving emails on this subject.

Flag this post as spam/abuse.

Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse][/collapse]

This thread is closed