Hello All,
Starting out in OERA and I've got a couple of questions. I have to admit I'm a bit overwhelmed by the complexity and the amount of code that needs to be written, and I can't help wondering if it's going to be a case of maybe 1% of developers even understanding what it is.
About fetchWhere( ) in the BE and its DAO: where, what?
Is the where clause against the physical schema buffers and fields? If it is against the physical schema, what's the point of the layer separation? The BE, or even worse its API, would need to know and reference the physical schema, tying it and the rest of the application to the physical source.
If it's against the logical view, where is the translation done? Another thing is that the where clause would possibly need to be refactored or broken down into multiple where clauses if the data source is made up of multiple buffers. Breaking the where clause may not be very hard to do if it were just predicates/conditions and AND operators, but it is a bit harder to do with real-world where clauses with parentheses and OR operators.
One more. Because of how the dataset is filled, maybe similar to an outer join, a filter on a data-source further down the line would not limit the records created by the data-sources higher up, the way an inner join would. What I'm trying to ask is: in this method, can you only have a filter on the main or first data-source?
Many thanks.
I raised that issue here: http://www.psdn.com/library/thread.jspa?threadID=3576&start=45&tstart=0. John Sadd's reply is at the bottom of that page:
"...
In addition, the materials out on PSDN, among other things, show the mapping process between a filtering request as expressed on the client and what it gets mapped into when it gets back to where the physical data is. If the example shows only a fieldname change, this at least is a placeholder for larger changes that the back-end data access logic would need to be prepared to deal with in a real application.
..."
I guess that mapping is just a field name replacement....
Since OERA is a Reference Architecture and not an implementation, the answer to all of your questions is "that depends how you implement it".
That said, I'm going to assume you're asking specifically in regard to the OERI (OpenEdge Reference Implementation) - AutoEdge.
About fetchWhere( ) in the BE and its DAO: where, what?
The where clause is passed to the BE in a context temp-table that accompanies the call.
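For what it's worth, here is a minimal sketch of that mechanism. All the names (ttContext, fetchWhere's signature, hBusinessEntity) are my own placeholders, not AutoEdge's actual ones:

DEFINE TEMP-TABLE ttContext NO-UNDO
    FIELD propName  AS CHARACTER
    FIELD propValue AS CHARACTER.

DEFINE VARIABLE hBusinessEntity AS HANDLE NO-UNDO. /* assumed to hold a running BE */
DEFINE VARIABLE hDataset        AS HANDLE NO-UNDO.

CREATE ttContext.
ASSIGN ttContext.propName  = "whereClause"
       ttContext.propValue = "Name BEGINS 'Smi'".  /* logical field names only */

/* the BE reads the context row and hands the clause to its DAO */
RUN fetchWhere IN hBusinessEntity (INPUT TABLE ttContext,
                                   OUTPUT DATASET-HANDLE hDataset).

The point is that the caller only ever states the filter in logical terms; the translation to physical names stays behind the call.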
Is the where clause against the physical schema buffers and fields ?
Without digging back through all the AutoEdge code, I think that AutoEdge didn't make the distinction here. That said, I'm working on a project where we're doing the mapping in a Data Access Object super procedure. To this end we've used the DATA-SOURCE-COMPLETE-MAP attribute.
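Roughly, the mapping pass looks like this sketch (names are made up, and I may have the pair order in the attribute backwards, so check the docs on your version):

DEFINE TEMP-TABLE eOrder NO-UNDO
    FIELD OrderNum   AS INTEGER
    FIELD TotalValue AS DECIMAL.

DEFINE VARIABLE cMap   AS CHARACTER NO-UNDO.
DEFINE VARIABLE cWhere AS CHARACTER NO-UNDO INITIAL "TotalValue > 100".
DEFINE VARIABLE i      AS INTEGER   NO-UNDO.

/* ... eOrder:ATTACH-DATA-SOURCE( ) has happened by this point;
   DATA-SOURCE-COMPLETE-MAP is only populated for an attached buffer ... */
cMap = BUFFER eOrder:HANDLE:DATA-SOURCE-COMPLETE-MAP.

DO i = 1 TO NUM-ENTRIES(cMap) BY 2:
    /* naive token replace; real code needs word-boundary and
       string-literal awareness */
    cWhere = REPLACE(cWhere, ENTRY(i, cMap), ENTRY(i + 1, cMap)).
END.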
Another thing is that the where clause would possibly need to be refactored or broken down into multiple where clauses if the data source is made up of multiple buffers.
In our case, we don't have to do too much refactoring as we've been able to do filtering on fields that are returned by the Data Source queries - not values that are calculated in after row fill or after table fill events. Of course, we could use these places, removing any results that don't fit the given filtering criteria, but that could really kill performance.
Breaking the where clause may not be very hard to do if it were just predicates/conditions and AND operators, but it is a bit harder to do with real-world where clauses with parentheses and OR operators.
Very true. That's why we've restricted our where clauses to only allow ANDs. That's a luxury that not every project would have.
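To illustrate why that's such a luxury: with ANDs only, the split is a simple scan over predicates. A sketch (the buffer names are assumptions):

DEFINE VARIABLE cFilter     AS CHARACTER NO-UNDO
    INITIAL "eOrder.OrderDate > 01/01/2007 AND eOrderLine.ItemNum = 42".
DEFINE VARIABLE cOrderWhere AS CHARACTER NO-UNDO.
DEFINE VARIABLE cLineWhere  AS CHARACTER NO-UNDO.
DEFINE VARIABLE cPred       AS CHARACTER NO-UNDO.
DEFINE VARIABLE i           AS INTEGER   NO-UNDO.

/* ENTRY( ) wants a single-character delimiter, so mark the ANDs first */
cFilter = REPLACE(cFilter, " AND ", CHR(1)).

DO i = 1 TO NUM-ENTRIES(cFilter, CHR(1)):
    cPred = TRIM(ENTRY(i, cFilter, CHR(1))).
    IF cPred BEGINS "eOrder." THEN
        cOrderWhere = cOrderWhere + (IF cOrderWhere = "" THEN "" ELSE " AND ") + cPred.
    ELSE
        cLineWhere  = cLineWhere  + (IF cLineWhere = "" THEN "" ELSE " AND ") + cPred.
END.

MESSAGE "Order filter:" cOrderWhere SKIP
        "Line filter:"  cLineWhere VIEW-AS ALERT-BOX.

One OR or one parenthesis anywhere and this falls apart, which is exactly the restriction we imposed.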
One of the things that would actually really help us here, though, would be an ABL Query Optimiser. If we're not filtering on indexed fields in the first table of the query, we would have to reorganise the whole query to make it into something that performs. Another luxury of this project (to date, at least. Let's hope that doesn't change too soon!)
What I'm trying to ask is: in this method, can you only have a filter on the main or first data-source?
As always, it depends on your implementation. In our current project, we can filter on any level of the ProDataSet, but for example batching only makes sense (for us) at the top level. We could of course implement that too if need be, but we haven't found a need yet.
As for outer join like stuff, it probably depends on how the tables in your ProDataSets are related and whether the relations are active at the time of the fill. With respect to AutoEdge, perhaps someone who has more recently looked at the code could comment on what was done.
I guess that mapping is just a field name replacement....
It might not be quite so easy, for example if you moved a field into another table. That said though, as long as you can add the new table to your Data Source's query, it's not that hard at all.
Hard might be something like: you used to store the order's total value in the order header record, but you decided to reduce redundancy, removed it from the header and now calculate it each time. Now you try to filter the orders to only show those with a total value of more than $X.
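To make that concrete, here's a sketch of the brute-force answer (sports2000-style names assumed): recompute the total per row in an AFTER-ROW-FILL callback and throw away the misses. It works, but every candidate order is still read and filled first, which is the performance problem mentioned above.

DEFINE TEMP-TABLE eOrder NO-UNDO
    FIELD OrderNum   AS INTEGER
    FIELD TotalValue AS DECIMAL.
DEFINE DATASET dsOrder FOR eOrder.

/* registration, done wherever the fill is set up */
BUFFER eOrder:SET-CALLBACK-PROCEDURE("AFTER-ROW-FILL", "postOrderRowFill",
                                     THIS-PROCEDURE).

PROCEDURE postOrderRowFill:
    DEFINE INPUT PARAMETER DATASET FOR dsOrder.

    DEFINE VARIABLE dTotal AS DECIMAL NO-UNDO.

    FOR EACH OrderLine NO-LOCK
        WHERE OrderLine.OrderNum = eOrder.OrderNum:
        dTotal = dTotal + OrderLine.ExtendedPrice.
    END.
    eOrder.TotalValue = dTotal.

    IF dTotal <= 1000 THEN  /* the "more than $X" filter, with 1000 standing in for X */
        DELETE eOrder.      /* row was read and filled for nothing */
END PROCEDURE.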
Of course, this is not a new problem - we've always had to deal with this one in some way.
Thanks guys.
That did a lot to reassure me. Progress had a development team, 4 architects and a good-sized amount of time, and they came up with something that's, what, still academic? What do they expect us, the consultants, private developers, and small development teams, to do?
It may be the frustration talking, but here we are AGAIN following the next big thing, the hype, and it's almost everything we hear about. And 5 years from now there may only be a few places that started OERA projects, even fewer people who actually understand it, and we'll be reading end-of-life and migration articles about OERA because it's too rigid, complicated or whatever.
I don't know if it'll catch on. For one it's faaarrrrr from simple (and documentation has never been a strong side); for that reason alone I don't think it's going to be anything like a common practice, and we'll be seeing regular discussions on OERA @PEG, for example. It just doesn't seem to me that simplicity and practicality get the attention they deserve (if it's not simple it simply won't be... or however the saying goes). And the thing is that apparently it's still not completely thought through.
In regards to a query optimizer, I wouldn't hold my breath. We're still asking for basic functionality in queries, specifically CAN-FIND and BREAK BY. It's like some other database vendor saying let's take out EXISTS and GROUP BY and see how they cope now. My feeling is that Progress just doesn't know enough about queries. ABL is very good, and maybe even elegant, at transaction processing, but if you ever have a query that's a bit more complicated than simple data validation it's just not good enough. Be it the language features or even the database (messaging, for example), they're just not good enough.
Kind of funny, since every performance tuning book I've read starts with something along the lines of: from my experience the biggest performance factor in applications is database access, and by far the biggest part of that is the queries.
On a side note, we have been working on a 4GL/ABL (V9 and above) query optimizer. Mapping from one view to another will not be in the first release, but the where clause is entered for the entire query, although it can also be set specifically for buffers, among other features like a dynamic execution plan, special scans, etc. We will be looking for beta sites in 2 to 3 months and I'm very interested to talk with anyone who'd like to try out the product and participate.
My first reaction is to use parameters instead of a where clause, similar to a report; these could also be freeform conditions that can be mapped to physical fields, and maybe parameters that are a bit more complicated than just a simple condition.
Maybe even not bother with DATA-SOURCE objects; just use one query that can be optimized according to the criteria, and CAN-FINDs to fill the dataset. For one, you could have a filter on the entire dataset and not just the first data source, which to me seems far too rigid and far too unrealistic. For example, think of having a dataset with order and orderline temp-tables: can you imagine not allowing filtering of the orderline, for example by item? I just hope DATA-SOURCE won't turn out to be a language wart.
I really appreciate the help guys. Still not sure about OERA.
To rephrase what I'm trying to say perhaps more clearly, I'm worried we're going towards ADM3.
And it's not that ADM2 didn't have uses; I recently used ADM2 on a small project. But at the end of the day it was too complicated to be used by most people (and the documentation was not good). And it was too rigid and didn't pass the real-world test.
If I could ask for one thing from the higher-ups, it's this: if you want people to be able to use it, make it simple. And we'd like easy-to-understand, preferably short, documentation.
I think the first thing you need to do here is to be clear that OERA <> AE. AE is an exercise that was intended to illustrate one version of OERA, but it is certainly not definitive and, if you have been following these forums, there is considerable dispute about how complete and correct it really is in terms of best practices. I think people need to be very aware of these issues and consequently cautious about trying to use AE as a model for a production system. And this is why we have been having the discussion about a requirements document, a solution components document, and some sample code to help provide a reference which is closer to a production ready model.
Which said, AE is a non-trivial accomplishment that illustrates a lot of interesting and important principles. It clearly is good enough that it has provoked a lot of detailed discussion and I think it could lead to a lot more. When something misses the point badly enough, there isn't much to say but that it missed the point. I don't think that is true of AE at all.
I think that some of what you are wrestling with is the same kind of issue we seem to have in the discussion of OO OERA. I.e., there are those of us who want to encapsulate data and behavior into an entity object and to have all of the application except the DA layer relate to the contract of that entity object, not to the database at all. The other view is that ABL has lots of nice structures for what amount to local representations of relational data and we should continue to take advantage of those, even though it means encapsulating logic separate from data and having to have shared definitions of the data structure. As you have seen, this latter approach tends to distribute awareness of the stored data structure through the application ... technically, one can include name changes and denormalization and such, but the simple thing is to make the two structures very, very similar. Of course, there is nothing wrong with having your database schema highly congruent with your object structure ... it is desirable, in fact. But, I think you need careful discipline to make sure that you keep the separation. AE doesn't do that very clearly and it is one of the things I dislike about it. But that doesn't mean we can't define something better, and you can do something better in your own implementations. I think that once you get sorted out how it should work, you will find that it is easier, not harder. It just takes some work to get there.
I wouldn't say that no one knows how to implement it. OERA is really no different than the architectural approach I was introduced to with Forté in 1995 and it wasn't new then. What it might be fair to say is that there are not yet any published ABL examples illustrating implementing OERA best practices. AE is a start, but more as a foil for discussion than a finished work.
While I realize that tackling OO and OERA at the same time might seem like a lot, I think there are some very good reasons to go that route. These include:
Most patterns that one can read about and learn from are strongly tied to OO concepts. Not that they can't be applied in .p code, but they often need "translation".
Modeling tools can be a big help in sorting out OERA issues and designs and essentially all modern modeling tools are inherently OO. Yes, there is an EA set of models for the .p AutoEdge, but again it is not working with the natural mode of the tool.
There is a lot of synergy between OO design principles and OERA design principles. Thinking about your components in an OO way is a good running head start to thinking about them in an OERA way and vice-versa. This may not be quite as obvious from the DOH/ROH examples.
Thanks Thomas,
I'm looking at a future project that I'll be starting in a couple of months, a relatively small SaaS project.
And I'm looking at using OERA, though, it probably won't be OO.
So I'll probably have lots of other questions. Thanks.
Consulting and mentoring are available ....
Hi Alon
As the person whose prime responsibility is the OERA, and one of those involved in AutoEdge, let me try to answer some of your points.
That did a lot to reassure me. Progress had a development team, 4 architects and a good-sized amount of time, and they came up with something that's, what, still academic? What do they expect us, the consultants, private developers, and small development teams, to do?
I'm not sure where this figure of 4 architects comes from; I don't believe we've ever stated how many architects, developers etc. were involved, but I can tell you it wasn't as big as people seem to be imagining. At any one time there was 1 lead architect, plus some development resource, so please don't get the impression you need an army of technical resource to do OERA. Which leads onto your next point...
It may be the frustration talking, but here we are AGAIN following the next big thing, the hype, and it's almost everything we hear about. And 5 years from now there may only be a few places that started OERA projects, even fewer people who actually understand it, and we'll be reading end-of-life and migration articles about OERA because it's too rigid, complicated or whatever.
Firstly, we need to distinguish between OERA and implementations. The OERA is a Reference Architecture, it is a set of guidelines on how we (Progress) believe applications should be architected. It has a bunch of definitions and explanations of what each component is, why we think they're valuable, and things to consider when designing these components. None of this is supposed to be rigid in any way, shape or form, other than providing what we feel is a valid Reference Architecture.
The next level down are implementations of these concepts. As Jamie has already commented, there is no one right way to implement these concepts. It's all about what is right for your given situation. Be that your application, your current technical resource, your future direction, etc. To that end, what we've done, and will continue to do, is provide ideas and examples on how some of these components could be implemented. For example, we have the original work done by John Sadd (http://www.psdn.com/library/kbcategory.jspa?categoryID=289), which consists of small samples that are procedural based. We have AutoEdge (http://www.psdn.com/library/kbcategory.jspa?categoryID=298), which takes some of the concepts and code from that initial work by John and expands it into more of an application example within a given described business case, with a great amount of documentation. Just this week we have posted some new class-based work (http://www.psdn.com/library/kbcategory.jspa?categoryID=1212). The point of this isn't to say, hey aren't we great, look at all the content we've done, but to say, look, there are different ways you can approach this, and here are some ideas and thoughts for you to take and adapt to your situation. Again, I would strongly argue that none of this is rigid. Nowhere are we telling you that you have to and must do it this way! Oh, and no, we're not building ADM3 !!!! (although some may argue we should)
I don't know if it'll catch on. For one it's faaarrrrr from simple (and documentation has never been a strong side); for that reason alone I don't think it's going to be anything like a common practice, and we'll be seeing regular discussions on OERA @PEG, for example. It just doesn't seem to me that simplicity and practicality get the attention they deserve (if it's not simple it simply won't be... or however the saying goes). And the thing is that apparently it's still not completely thought through.
One of my postings in another Thread (http://www.psdn.com/library/thread.jspa?threadID=3232&tstart=0) talks about what's coming, and as I said in that posting, we realize that the material posted so far could be considered advanced or appear complex. So in the very near future (the next couple of weeks all things being equal) we will be posting the first phase of a new set of material entitled 'Architecture Made Simple', where the aim is very much to simplify and help introduce ideas and code in terms and coding concepts that most people are using today when they create ABL applications. So hopefully in a couple of weeks this will add even more reassurance.
I really appreciate the help guys. Still not sure about OERA.
It's my job, and that of a group of others, to make sure that we help as much as possible with making the concepts and ideas around OERA as simple and as consumable as possible. By using the forum, getting feedback, listening, answering and producing material I hope we achieve that. If there are specific questions then please continue to use the forum. If there are comments or questions on the material, please continue to use the forum. As others have said, we're starting to have some deep and meaningful discussions which, although they can get a little heavy and loud sometimes, are hopefully valuable; but at the same time we have to be mindful of the simple stuff as well.
Mike
As others have said, we're starting to have some deep and meaningful discussions which, although they can get a little heavy and loud sometimes, are hopefully valuable; but at the same time we have to be mindful of the simple stuff as well.
Mike
If nothing else, opening the channels of communication, and keeping them open, is one of the huge potential benefits from this effort.
...
If nothing else, opening the channels of communication, and keeping them open, is one of the huge potential benefits from this effort.
And communication is a two way thing, so please keep it coming. As I've said before, please encourage others to join in. It's only by having a strong community with a good communication channel that we can not only validate what we're striving to achieve, but also make sure that we're delivering on what is needed, not what we think is needed.
Mike, I certainly think that it is a great idea to be thinking along the lines of what you can do to create OERA for Dummies, but I think it is also worth noting that there is really nothing about OERA that is more complex than the world it is trying to respond to. The truth of the matter is that the development target has simply become more complex as the years have gone by, and the old, simple, monolithic architectures that many of us grew up on simply don't cut the mustard any more. It isn't just that SOA and ESB are virtuous for the enhanced functionality they provide, although that is true too; it has also gotten to the point that they are essential. In this context, OERA is itself already helping to simplify the complex requirements of modern applications because it gives us a strategy, an approach for managing these complex requirements by partitioning and encapsulating the application into manageable units. Once one gets the idea, actually producing these components becomes pretty straightforward and complex requirements become straightforward to achieve. There is just some ramp-up in coming to grips with the issues.
Mike, I certainly think that it is a great idea to be thinking along the lines of what you can do to create OERA for Dummies, but I think it is also worth noting that there is really nothing about OERA that is more complex than the world it is trying to respond to.
True, but like anything new, it can appear daunting at first, and our aim with 'Architecture Made Simple' is to try and remove that initial fear.
And, I think that is a great goal. I just think that people should simultaneously realize that modern development architecture is simply more complex than it used to be and so one needs to set aside some time and resources to come to grips with it. Just because it isn't easy doesn't mean that it isn't essential.
I'm probably not able to demystify this, but I'm convinced that SOA actually can simplify application building and also query logic. One of the main benefits of SOA is that it allows separation of concerns. The layered architecture and the data access layer allows you to hide query complexities. You still need to implement complex queries in the data access layer, but allowing them to be triggered from less complex input might still simplify both the implementation and the usage.
It's easy to get lost in infrastructure and details of parameter usage in our samples. I suggest you focus on the data flow first. The OERA is really all about data and business logic.
What is new is the introduction of the logical layer. At first one might see this as an additional complexity, but it really is the key to simplification and allows you to define your schema the way the world should see it, but also purposed for the actual need. This does require that you put some effort into defining the logical schema. The very need of a complex query might be a hint that perhaps there's a need to keep some of the complexity under the covers and simplify the conceptual schema.
You should also consider abstracting and simplifying the way a complex query is presented and triggered. These are not new issues or problems. I've seen many Progress applications that have successfully managed to abstract complex queries into an understandable and usable UI (meaning that the user does not have to type anything that has complex expressions and/or nested parentheses). With SOA you basically need to provide this abstraction as part of the Service Interface instead.
When all this is said there is certainly a need for some advanced mapping of client query expressions to data access expressions.
Some of our query mapping experiments have already been published on PSDN.
Somewhere in the auditing examples you will find query classes that support mapping of complex queries. It is limited and rather hastily implemented from a mix of ADM2 query manipulation working together with a primitive query parser, and it only deals with one-to-one field mappings.
BUT it does this across tables and handles any number of parentheses and both AND and OR. It can also be used for query join optimization for more complex data sources. (I don't think this is shown in these samples, though.)
It actually works in many cases, but I have to add a disclaimer that it is meant as a sample of what one could do in a real implementation. I'm not at all suggesting that you should try to have 100% automatic query mapping, but that you can get very far with a 98% solution together with abstraction that allows simplified input to invoke more complex queries.
WARNING: These samples are not documented. They did not need to be for the Auditing samples...
The plan is to include this in a whitepaper that explains ADM2's OERA support, but the date for this is unknown.
I think you are absolutely right, Haavard: figuring out the whole architecture may be complex, but once the rules are set up, the actual development of the components is actually easier. Not only that, but there is a potential for additional specialization, such as we have had where a separate UI layer allows one person to be particularly good at building UI or GUI (or even ChUI) interfaces while someone else is better at handling the back-end logic; now there are more types of well-defined components, and a developer can be producing top-level work without having to master the full range of skills.
Thank you to everyone who replied,
Thank you for all the explanation guys.
Just to clear up a few things ... I fully realize the OERA is a non-specific abstract reference model and the OERI and AutoEdge are sample, reference implementations. I'm certainly not against separating BL/UI or the physical and internal views, nor do I have a problem with the Service Adapter/Interface and BE and DAO in general, etc.
I specifically had a problem with what still looks like a very rigid, maybe even contradictory to the OERA, suggested implementation of the fetchWhere( ) method, and with how useful DATA-SOURCE objects really are.
I personally had, I believe, more than my fair share of experience with ADM2 and Dynamics. I won't be voting for ADM3 and I hope Progress isn't. I don't think it was the best investment for what goes back maybe more than a decade. There are many other important things to invest in.
Although, I believe Progress should get back to the business of writing reporting applications, not just applications focusing on transaction processing.
As to your fetchWhere() issues, I think you have expressed some valid concerns, and I think these have a lot to do with what I have been calling the DOH or ROH approach in the OO OERA discussions. The existing AE code comes from a similar philosophy, even though it doesn't yet use proper objects, and your concerns are another symptom of not having fully encapsulated the entities in the way I believe they should be. But I think the right solution here is to get the right mindset, and then the mechanisms will fall into place. When the TT or PDS are near clones of the RDBMS structure, it is going to be hard to keep people from thinking about them in the same way that they did when DB access from everywhere was the norm.
On the DataSource front, I think you will see in the OO OERA materials that there has been some shift in emphasis from AE to at least allow a more integrated or unified DA strategy. However, this isn't all good and I think it needs more exploration. But I think we need some of the basics in place before we get very far on that.
Rather than attempting to come up with ADMn, I would hope that PSC moves to help in creating a library of useful framework components and to advancing tools to generate code from models, but in a way which can be easily adjusted to the needs of the individual shop. T4BL is not a big step in this direction.
Interesting that you should have this focus on reporting. I would have thought that many shops were moving in the direction of the use of third party reporting tools rather than writing ABL code at all. To be sure, I think there are some improvements we should see in things like query optimization and no-index reads, but I want those things more for interactive functions than reporting.
I truly believe that they will never have a true understanding of what features are needed if they don't build these types of applications.
The 4GL is fast, simple, maybe even elegant when it comes to transaction processing, but whenever you come across a query that's a bit more complicated than simple data validation it's just not good enough.
Mainly because that's almost all they've been doing for as far back as I can remember: frameworks, architectures, articles, etc. Heck, I'm not even sure the OERA and especially the OERI are compatible with reporting and BI, or even should be. It just seems to me it's not really that much a part of the consideration.
It's not just an optimizer, although that's probably the most important one. Query objects are missing fundamental capabilities (CAN-FIND and BREAK BY), other types of scans (NO-INDEX etc.), the 4GL remote connection implementation for multiple-buffer queries (that, IMHO, is a bug), maybe even set-related operations like subqueries, etc. All of these features may not be an issue for transaction processing, but they're absolutely essential for real-world queries.
More than that, there are no query-focused benchmarks, although I don't think there can be meaningful query benchmarks without a query optimizer; it's not just about how fast records can be read. If I remember correctly, at the last benchmarks Tom Bascom asked for read-related benchmark stats and it was more or less dismissed. Personally, I find that crazy.
Almost every database performance tuning book I've read starts out something along the lines of: from my xx years of experience, the biggest performance factor is database access, and of that by far the biggest one is queries. We've been talking about just NO-INDEX scans for, I'm starting to forget how long, but we're still getting "who's going to use that" (BTW, one of my favorite responses).
I have to be off now (it's almost 2am here) but I'll hopefully continue tomorrow.
Thanks.
While I am certainly with you in wanting these feature in the ABL query, I don't know that I ever expect to write a report in ABL again. The closest I've come in years to doing that is an ABL front end which did a lot of complex calculations in connection with processing author royalty statements which then exported all that into a flat file and handed off to Actuate. For anything else, including some really hideously complex reports, everything was done in Actuate straight from the database.
Of course, I can see where it could be nice to have an ABL component to gather the data and then pass over a dataset. That would certainly facilitate incorporating security issues and such. I haven't tried that yet, though.
I do agree that these kinds of queries are something we should get included in the requirements document for the data access layer. One of the things it illustrates is the range of different types of access one might want. Of course, then one of the possible answers might be using a SQL data source buried in there!
> Of course, then one of the possible answers might be using a SQL data source buried in there!
My feeling is that having SQL access instead of working on these features in ABL may be the fast answer, but not the right one.
I know you're suggesting having both but I don't think that will happen and I'm not really excited if it will.
There are a couple of reasons I keep bringing up this idea. One, of course, is that it is a potential quick fix for set oriented operations. Another is that ... after all ... SQL was designed for set oriented operations and so it isn't necessarily a bad thing, if it can be encapsulated. But, another is that if we can move to this kind of controlled connection, it opens up the possibility of data sources addressing different databases by different technologies as needed. That seems to me to be an increase in capability.
Weren't you going to bed?
Here's another example why Progress needs to develop reporting tools.
http://www.psdn.com/library/thread.jspa?threadID=3726&tstart=0
Simply put, it's everything that's missing from the 4GL.
Reporting tools, or the features needed for these types of applications, are everything that's missing from the 4GL, and not just on the database side.
There will never be a true understanding and internalization of what we need if they only develop for half an application.
I think there might well be an audience for someone to develop an ABL package to make it easy to interface to Excel, but I don't know that it needs to come from PSC.
For making pretty reports, I wouldn't consider using ABL unless it was in the rare context where I needed to do some pre-processing and create an interface file. Nothing you could reasonably do to ABL would make it a better tool for reporting than something like Actuate.
Sorry for the late response, it's the holidays here
The lack of native ZIP support needed for generating documents is just another example. At the very least there would have been a wider understanding of and awareness of these needs. They could have gotten involved, offered solutions/alternatives, published articles, etc.
But I'm not fooling myself with wishful thinking I'll probably have to write something from scratch, as usual.
And you know what, the fetchWhere( ) issue is another good example.
Personally, it's the first thing that popped into mind when I first read about DATASETs and DATA-SOURCEs, even before reading about the OERI. I believe this would never have happened if they had more experience in this side of the application.
Bottom line, the benchmarks will still only measure writes not reads, we will still be getting "Who's going to need that??" responses, and so on.
I don't think it would be far-fetched by now to say that in a few, but very important, areas we're coming up on decades behind, most notably queries: their lack of capabilities, the database implementation's incompatibility with queries, etc.
And that is a direct result of not having experience in this side of the application.
Strategy-wise, they need experience in this side of the application.
There's more to applications than just **** transactions (please excuse me but some things need to be said ).
I understand your frustrations, but I moderated the message.
Message was edited by: Mike Ormerod
And you know what, the fetchWhere( ) issue is another good example. Personally, it's the first thing that popped into mind when I first read about DATASETs and DATA-SOURCEs, even before reading about the OERI. I believe this would never have happened if they had more experience in this side of the application.
If we're still talking about reporting here, then have you looked at the work with Crystal and AutoEdge? Crystal can use datasets as a native object, and so in AutoEdge we populate a DataSet and pass it on to Crystal through a proxy layer for reporting. Or am I missing the point?
In an earlier post you mention "Mainly because that's almost all they've been doing for as far back as I can remember: frameworks, architectures, articles, etc. Heck, I'm not even sure the OERA and especially the OERI are compatible with reporting and BI, or even should be. It just seems to me it's not really that much a part of the consideration."
Again, is this in relation to just reporting, or building applications in general?
Mike, there is some of what Alon says that I agree with and some I don't. In terms of actually creating the report, I think that using Actuate or possibly even Crystal is a fine way to go and trying to add all those pretty print features into ABL is wasted effort. We also have PDF Include if one wants to do it all in ABL. It is a solved problem.
But, there are times when getting the data directly via SQL doesn't actually work. It isn't common or typical, but it happens. For this the dataset approach or some kind of interface is needed and the dataset has to be created in ABL. There are also query requirements where the output is on screen and there is no real role for a third party reporting tool. There one needs to depend on ABL for retrieving the desired data. And there we are having to work hobbled because there are features available in SQL queries that are not available to ABL such as no-index reads and query optimizers. Why should ABL be a second class citizen?
As to the zip thing, I think that is another problem appropriately solved by third party tools.
There are also some of these points which are well solved by packages written in ABL which make using the technology easier. PDF Include is a good example. It allows someone to use the technology in a simple way without having to learn all the details. I see no reason why that should be built into the language. There are too many keywords as it is.
Thank you for the reply Mike,
I have noticed the Crystal integration in AutoEdge and it is a very welcomed change that Progress is taking to explore that side of the application.
It is certainly something that I felt had almost been ignored in ADM2, then in Dynamics, and was largely outsourced out of the 4GL.
Personally, I don't really believe in using BE's as data sources/components for reports. Simply because reporting and transaction processing applications have very different needs.
Among a few reasons are that -
It's not really as efficient and as easy to combine several BE's (similar to joins) as the data source for a report.
When analysing data, denormalized data is mostly better suited (just one big resultset) because it's easier to sort, group, etc. Whereas when updating data, normalized data is more suitable so different entities can be updated separately.
SQL VIEWs are in most cases a much more ideal replacement.
And then there's the database and query features in ABL, or the lack thereof.
You can't have a report (big, small, almost any type of report) without a query optimizer, though that is arguably true for most queries as well.
There's the 4GL remote connection implementation, and the lack of fundamental features in queries, specifically CAN-FIND and BREAK BY, etc.
It's not that I have a problem with the separation where transaction processing is mostly done in ABL and reporting and BI are mostly done in SQL.
But I also do not think that it can't or shouldn't be done in ABL, at least the query, with the presentation done some other way, for example with Excel reporting features.
Of course, not every report or query that's a bit more complicated is or can be done in SQL or with an external tool.
What I've been arguing is that it's all tied together, even back to fetchWhere( ).
I definitely think that it is crucial that reporting and BI get a lot more attention in the sample applications.
Reporting and BI are just as big if not a bigger part of the application.
Just my 2 cents.
But, there are times when getting the data directly via SQL doesn't actually work. It isn't common or typical, but it happens. For this the dataset approach or some kind of interface is needed and the dataset has to be created in ABL. There are also query requirements where the output is on screen and there is no real role for a third party reporting tool. There one needs to depend on ABL for retrieving the desired data. And there we are having to work hobbled because there are features available in SQL queries that are not available to ABL such as no-index reads and query optimizers. Why should ABL be a second class citizen?
Sure, and I understand that. When I was an AP we used to write reports in ABL because that was either the quickest way to get the data, or it needed some form of query that SQL or a 3rd-party tool couldn't achieve, or it was a good revenue stream.
If the frustration is that there isn't a whizz-bang reporting tool in the box, I see the point. If it's because we're not providing good samples and examples of how to do reporting in an OERA world, then I can take it as a requirement. If it's a question of the practical experience of people building apps, then I question the conclusion.
.....
Reporting and BI are just as big if not a bigger part of the application.
Just my 2 cents.
And a valuable 2 cents they are. My posting above crossed your reply, so you have clarified some of my questions.
In terms of SQL Views, can the same effect not be achieved by a denormalized dataset?
Thank you Mike, and thank you for your patience
I don't know if they are valuable
I feel that BE's are more comparable with stored procedures, for reports.
Because they're mostly standalone, I don't think that you can join BE's (as in a query) as efficiently and as easily as tables.
For example, joining Orderline and Item BE's would most likely be less efficient and harder to do than an Orderline and Item table join.
For one, you could possibly be creating many, many temp-table records that eventually wouldn't pass the criteria or joins.
I don't know if you can have one implementation that's optimized for both reporting and transaction processing. I mean, they're very different.
Maybe in the current situation a separate SQL reporting and BI solution, apart from the transaction processing architecture, would be more optimal?
Again thanks for the patience
Thank you Mike, and thank you for your patience
I don't know if they are valuable
Patience not needed, feedback definitely is, so yes valuable.
>....
I don't know if you can have one implementation that's optimized for both reporting and transaction processing. I mean, they're very different.
Ok, now I'm on the same page. This is a valid point and one that, you're right, probably needs some considered thought. I could argue that we've first been trying to get people to just think about the OERA to get data in; once we have people doing that, then we can consider how you'd get it back out. But that probably wouldn't wash!
There is of course the whole notion that you shouldn't be doing major reporting against the transaction db anyway, and that you should move the data off to a reporting db or data warehouse for all but the most trivial reporting.
But yes, I agree in so much as this is another great area for us to dig our teeth into and give it some thought. It might be worth starting another thread on reporting 'best practice', and how people are approaching and achieving it today, and how they see it working in a more loosely coupled distributed architecture.
Why not have a BE which corresponds to the report or a group of reports? Whoever said that a BE can't be complex? One can have table-level objects, but certainly even a single Order is many tables, so why not an Order Set?
No reason you can't populate it as efficiently as the ABL will let you.
Frankly, this kind of construct is essential for some kinds of transaction processing as well.
I've always thought that this was a silly notion, except for extreme cases. Unless you are replicating transaction by transaction ... which may well be a good idea for disaster recovery and such ... only on the live database will you have all the most current data. We just need to make sure that one can get the data out efficiently.
There is of course the whole notion that you shouldn't be doing major reporting against the transaction db anyway, and that you should move the data off to a reporting db or data warehouse for all but the most trivial reporting.
I think you're referring to "business intelligence" instead of complex, but flexible page-based reporting. For the latter think about printing an invoice (this can of course be a paperless print nowadays). How do you want to do that with a replicated database?
The reporting product "List & Label" has a nice approach: the application can provide the meta data, so the report can be run against any XML. We use this approach for the more complex page based reports, like invoices, order confirmation, etc.
Then there are the straightforward reports. Nowadays you can do a lot with HTML for those simpler views. But a picklist for the warehouse can be relatively simple to print as well.
Finally there is the management view: heavy data mining on lots of tables with a digital dashboard as the end result, for instance. Here you probably need some kind of transformation service to copy the data out to a dedicated data warehouse.
I could argue that we've first been trying to get people to just think about the OERA to get data in; once we have people doing that, then we can consider how you'd get it back out.
What's wrong with implementing dedicated "reporting services" that return denormalized and read-only data? The tricky part is batching of the data, in my opinion, but that problem exists for any "search request" you implement at the user interface level as well.
You shouldn't make the mistake of trying to solve all your problems with entities and tasks and what have you. This is a different aspect of the application, so it requires a different approach.
In regards to joining one BE to other BE's, similar to joining tables in a query.
If the end result is the dataset created in the BE, as it would be if the BE was used standalone, I don't see any problem. And there are definitely cases where a query and VIEWs wouldn't be suitable, like recursive queries, for example with product structures.
But if the end result is a dataset taken from several BE's datasets, then you would in a sense be reading the data through records you create in the dataset, records that are not necessarily going to be in the end result.
And just like in a query, where not all the table records the database server reads match the criteria or joins and end up in the result (possibly thousands or even millions), not all the records created in the dataset will.
But in this case you wouldn't have just read records in a table; you would have created them, passed them around between AppServers, and traversed them when querying the dataset.
Using VIEWs (representing an internal view of the data) you would have no problem joining any number of VIEWs in any variation, as you would tables.
Of course the query would be optimized, and it would be optimized at the table level; after all, it's just one big query. The join order could be changed so that, for example, the table with the most substantial bracket would drive the query, etc.
Not the best explanation
Add ABL access to Views on your list of things needed to give ABL parity.
I'm not sure that I am following all your assumptions here. You talk of joining BEs ... but a BE isn't where the data access happens. If you are assuming that 1 DA object = 1 table, then yes, one would expect some inefficiencies in retrieval, but I also see no reason to make that assumption. When multiple tables are required to construct one "entity" and those tables are found in the same place, then I see no reason not to put them all in the same data source.
Let's take an example like "all orders received yesterday". And, let's assume that orders, items, and customers each come from three different databases on separate machines. I would start by expecting one data source for all of the direct order data from the order database. For some purposes, that may be all one needs. If I wanted to do an analysis on something like customer type, which was not stored in the order tables, then I would expect to do something like a lazy load in which there was a local customer data source that would take a list of customer identifiers and return a set of customers from the customer database. Likewise if I needed the actual item objects, e.g., to do an analysis on stock levels in relationship to the orders.
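A sketch of that lazy load (fetchCustomersByKeyList is an invented DAO entry point, and the temp-tables are trimmed down to the interesting fields):

DEFINE TEMP-TABLE eOrder NO-UNDO
    FIELD OrderNum AS INTEGER
    FIELD CustNum  AS INTEGER.
DEFINE TEMP-TABLE eCustomer NO-UNDO
    FIELD CustNum  AS INTEGER
    FIELD CustType AS CHARACTER.

DEFINE VARIABLE hCustomerDAO AS HANDLE    NO-UNDO. /* running customer DAO */
DEFINE VARIABLE cCustList    AS CHARACTER NO-UNDO.

/* collect the distinct customer keys from the already-filled orders */
FOR EACH eOrder BREAK BY eOrder.CustNum:
    IF FIRST-OF(eOrder.CustNum) THEN
        cCustList = cCustList
                  + (IF cCustList = "" THEN "" ELSE ",")
                  + STRING(eOrder.CustNum).
END.

/* one request against the customer database instead of one per order */
RUN fetchCustomersByKeyList IN hCustomerDAO
    (INPUT cCustList, OUTPUT TABLE eCustomer APPEND).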
You're working with the internal, not the physical, view of the data; wouldn't you be repeating logic this way? I mean, you don't create a new BE and DAO if you need to make updates to several different BE's. I guess you would build a Business Task that uses several BEs and DAOs.
As I said, I can certainly see using a standalone BE as the data source comparable to a stored procedure. And I can certainly think of cases where a query wouldn't be suitable. For example, where a recursive run is required as in product structure trees.
But I don't think DATA-SOURCE's should be used, at all. For many of the same reasons and bringing us back to the start of this thread and the issues with even the fetchWhere( ) method.
The problems and limitations with the where clause suggest maybe using something more similar to parameters. Basically you only have outer joins to work with; or you could later use them with an inner join, but then you would have many records that you created, passed around, and now traverse that wouldn't show up in the query, and so on.
With SQL VIEWs, you or even the user could work with the internal view of the data, and reuse and join them in any way, as you would with tables. You would not have where clause limitations, and you have far more than just inner and outer joins to work with.
There's no question of efficiency; most importantly, there's a query optimizer. With a DAO (or the current implementation) the query, or the way the dataset is filled, has a fixed order. It will always fill the dataset from Customer to Orders then Items, even if the filter is only for a specific Item (assuming the implementation would allow filters on Items).
I have to be off now, I'll try posting code samples next time for DATA-SOURCE's and query optimization.
I'm getting confused again ... I'm not at all sure that we are using a common vocabulary.
No, I am not suggesting willy-nilly making up new data access objects for every purpose and yes, I think they should be widely reused. I might however, have an object with two "modes" -- getting individual entities or getting sets of entities and it is even possible that I might permit a certain amount of duplication when there are fundamentally different ways of accessing the same set of tables, e.g., selecting a group of orders based on characteristics of the order (e.g., placed yesterday) versus selecting a set of order lines with associated order info based on characteristics of the line (all orders containing item X).
A lot of variations can be built in to a single object, but I'm not going to cry if it turns out I have to have a small amount of duplication in order to be efficient and natural in use.
In the earlier description, I forgot to mention that there is an aspect of lazy load to what I am suggesting. This keeps one from requesting item info triggered by retrieving each line. Instead, one gets the item info as a set when all the lines are loaded, thus not duplicating any requests.
I'm getting confused again ... I'm not at all sure that we are using a common vocabulary.
I kind of feel the same way; this is all mostly new to me.
I've got a great example of how to optimize 4GL queries that I've been using since V6, and maybe how it could be used with fetchWhere( ), that I'll try posting tomorrow.
It would also be a great opportunity to talk about the query optimizer I'm working on.
If I understand correctly the idea you suggested is -
Having a number of methods, generally, for different execution plans.
For example, fetchDataByOrder( ) that starts the dataset fill from the Orders table, fetchDataByItem( ) that starts from the Item table and so on.
Or, generally, a method for every table (which could also be a query) driving the dataset fill.
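If I've understood the suggestion, the shape would be something like this sketch (sports2000-style names; the item-driven body is only stubbed, since that's the part that has to be hand-coded):

DEFINE TEMP-TABLE eOrder     NO-UNDO LIKE Order.
DEFINE TEMP-TABLE eOrderLine NO-UNDO LIKE OrderLine.
DEFINE DATASET dsOrder FOR eOrder, eOrderLine
    DATA-RELATION drLine FOR eOrder, eOrderLine
        RELATION-FIELDS (OrderNum, OrderNum).
DEFINE DATA-SOURCE srcOrder     FOR Order.
DEFINE DATA-SOURCE srcOrderLine FOR OrderLine.

PROCEDURE fetchDataByOrder:
    DEFINE INPUT  PARAMETER pcWhere AS CHARACTER NO-UNDO.
    DEFINE OUTPUT PARAMETER DATASET FOR dsOrder.

    /* order-driven plan: the standard relation-driven fill */
    BUFFER eOrder:ATTACH-DATA-SOURCE(DATA-SOURCE srcOrder:HANDLE).
    BUFFER eOrderLine:ATTACH-DATA-SOURCE(DATA-SOURCE srcOrderLine:HANDLE).
    DATA-SOURCE srcOrder:FILL-WHERE-STRING = pcWhere.
    DATASET dsOrder:FILL().
END PROCEDURE.

PROCEDURE fetchDataByItem:
    DEFINE INPUT  PARAMETER pcWhere AS CHARACTER NO-UNDO.
    DEFINE OUTPUT PARAMETER DATASET FOR dsOrder.

    /* item-driven plan: a hand-coded fill that finds the matching
       items first, then their order lines, then the parent orders */
END PROCEDURE.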
The problems I have with it especially for reporting are -
Okay you've got different ways to execute the dataset fill optimized for different criteria. A method to fetch data by order criteria, a method to fetch data by item criteria and so on.
But there's no optimizer that decides which plan or method would be better/faster; the method to use has to be specified explicitly.
Obviously, when end-users (secretaries, shop-floor or inventory workers, etc.) enter a filter for a report, you wouldn't want to ask them whether it would be better to fill the dataset by the orders, the items, or possibly many other ways.
Another problem that pops to mind is that if we were to go with the DATA-SOURCE object, it would generally mean only being able to have a filter on the first DATA-SOURCE. For example, a filter only on orders or only on items.
When a user enters a report filter you wouldn't want to limit him to filtering either only by order, only by item, or by some other single group of information. Most of the time report filters involve a cross-section of multiple groups of information, for example, orders sold in the past year for items from group or type X.
For that reason alone, I think, DATA-SOURCE has no use.
Instead, one big query can be used (which could possibly involve subqueries, procedures, functions, etc.), with CAN-FIND to check whether a temp-table record has already been created. Another option is to use a sorted query to know when to create records, especially if a sorted query is needed for aggregates etc. anyway.
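A sketch of that single-query fill, using the cross-section filter from above (sports2000-style names; the literals stand in for the user's criteria):

DEFINE TEMP-TABLE eOrder     NO-UNDO LIKE Order.
DEFINE TEMP-TABLE eOrderLine NO-UNDO LIKE OrderLine.

/* one join carries the whole cross-section filter; CAN-FIND keeps
   parent rows from being created twice */
FOR EACH OrderLine NO-LOCK
        WHERE OrderLine.ItemNum = 42,
    FIRST Order NO-LOCK
        WHERE Order.OrderNum = OrderLine.OrderNum
          AND Order.OrderDate > 01/01/2007:

    IF NOT CAN-FIND(FIRST eOrder
                    WHERE eOrder.OrderNum = Order.OrderNum) THEN DO:
        CREATE eOrder.
        BUFFER-COPY Order TO eOrder.
    END.

    CREATE eOrderLine.
    BUFFER-COPY OrderLine TO eOrderLine.
END.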
Besides, I think it overly complicates and messes up the concept. BE and DAO; that's enough.
I don't see that this follows from anything before it. Are you talking about the data source in a PDS or the data source object type as used in AutoEdge? Have you looked at the potential/proposed shift in the handling of the latter that shows up in the new OO OERA stuff? If we are talking about the object type, the reason to use it is to isolate one from where the data is actually coming from. If it is local, then it has to be the one that knows about how to structure the retrieval. If it is remote, then it hands off the hints.
> No, in lieu of an optimizer, I would expect the programmer to make this choice
A report with "orders sold this year from item type X" filter comes in.
You've got fetchDataByOrder( ) and fetchDataByItem( ).
How would the program know which one would be better/faster ?
> I don't see that this follows from anything before it.
It basically means being able to put a filter on only the first DATA-SOURCE filled in the DATASET.
I've got a dataset with an order temp-table and orderline temp-table.
Please show me a code snippet that will fill the dataset with only order.custnum = X and orderline.itemnum = Y.
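For reference, here is roughly what the plain DATA-SOURCE route looks like for that request: a sketch with sports2000-style names, where the literals stand in for X and Y. The caveat is the whole point: the fill is parent-driven, so every order for customer X gets created even when it has no line for item Y, and a post-fill pass has to delete the childless parents.

DEFINE TEMP-TABLE eOrder     NO-UNDO LIKE Order.
DEFINE TEMP-TABLE eOrderLine NO-UNDO LIKE OrderLine.

DEFINE DATASET dsOrder FOR eOrder, eOrderLine
    DATA-RELATION drLine FOR eOrder, eOrderLine
        RELATION-FIELDS (OrderNum, OrderNum).

DEFINE DATA-SOURCE srcOrder     FOR Order.
DEFINE DATA-SOURCE srcOrderLine FOR OrderLine.

BUFFER eOrder:ATTACH-DATA-SOURCE(DATA-SOURCE srcOrder:HANDLE).
BUFFER eOrderLine:ATTACH-DATA-SOURCE(DATA-SOURCE srcOrderLine:HANDLE).

DATA-SOURCE srcOrder:FILL-WHERE-STRING = "WHERE Order.CustNum = 1".
/* depending on version, a child's FILL-WHERE-STRING may replace the
   relation-generated join, in which case the join clause has to be
   repeated here by hand */
DATA-SOURCE srcOrderLine:FILL-WHERE-STRING = "WHERE OrderLine.ItemNum = 42".

DATASET dsOrder:FILL().

/* outer-join behaviour: clean up the orders that matched no line */
FOR EACH eOrder WHERE NOT CAN-FIND(FIRST eOrderLine
        WHERE eOrderLine.OrderNum = eOrder.OrderNum):
    DELETE eOrder.
END.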
Well, I haven't convinced myself pro or con with respect to PDS yet, so I don't suppose I am the right person to respond to that ...
In regards to query optimization.
The 4GL uses a syntactical (rules-based) query optimizer where the indexes can be changed (multiple indexes can be used) but the join order is static.
Changing the indexes, or using multiple indexes, won't do a lot of good in many, many real-world queries.
For example, let's take the following query -
FOR EACH order, EACH orderline OF order, EACH item OF orderline.
Let's say we've got 1 million order records (including historical records), 10 million orderline records and 1,000 item records.
If a WHERE clause for the item buffer is added, for example, item.itemnum = 1 it would still mean traversing millions and millions of records, actually the whole query, even if it only returns 5.
In regards to the DATA-SOURCE objects.
Without being able to deal with a fill for an order and orderline dataset, it is useless.
And, I think, we would be going in the same way we have with ADM. At the end of the day it would be too rigid and unrealistic, not to mention needlessly too complicated.
This would not be an issue if one order/orderline query were used. Not to mention much simpler for people to understand and use, I think. Even if it's a great idea, if it's not simple, I believe people won't use it.
Which is why I think this type of query needs to be recognized as being of an inverted type and handled differently. The clue here is that a constraint on item is going to be very strong compared with most constraints on order and, presumably, one has order lines indexed by item, so one shouldn't use that FOR EACH to find these records. It is clearly not sufficient to change the where clause under an otherwise static query.
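In code, the inversion looks something like this sketch (sports2000-style tables assumed):

/* drive from the highly selective item, use the item index on the
   order lines, and pick up the parent order last */
FOR EACH Item NO-LOCK
        WHERE Item.ItemNum = 1,
    EACH OrderLine NO-LOCK
        WHERE OrderLine.ItemNum = Item.ItemNum,
    FIRST Order NO-LOCK
        WHERE Order.OrderNum = OrderLine.OrderNum:
    DISPLAY Order.OrderNum OrderLine.Qty.
END.

This touches only the rows for the one item instead of scanning every order and every order line, which is exactly the choice a cost-based optimizer would have made on its own.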
Back from a night out
Are you still holding to the position that there can be one implementation that is suitable for both transaction processing and reporting?
And do you not think that in most cases it would be better to use SQL views, stored procedures, etc. instead of BE's?
In regards to query optimization. The 4GL uses a syntactical (rules-based) query optimizer where the indexes can be changed (multiple indexes can be used) but the join order is static.
I think the point with the query optimizer is the possibility to add indexes at deployment when the situation asks for it. A traditional 4GL program needs recompilation to pick up the new situation, potentially reprogramming. With a query optimizer the program can run as is and it will benefit from the new situation. But we're not that far yet, since you can't add an index without affecting the CRC of the r-code (welcome to 2007).
In regards to the DATA-SOURCE objects. Without being able to deal with a fill for an order and orderline dataset, it is useless.
I think you need to separate the architectural concept of a Data Source Object from the language feature of DATA-SOURCE. There is nothing to stop you having your own custom query code to perform the query & fill within a DSO, using the BEFORE-FILL event on a dataset as shown in one of John's original papers (http://www.psdn.com/library/entry.jspa?externalID=1175&categoryID=293). Using this technique you have full control over what you populate, how you filter, etc. So as I say, I think you need to separate the language feature from the architectural concept.
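A minimal sketch of that technique (names assumed; see John's paper for the full treatment):

DEFINE TEMP-TABLE eOrder     NO-UNDO LIKE Order.
DEFINE TEMP-TABLE eOrderLine NO-UNDO LIKE OrderLine.
DEFINE DATASET dsOrder FOR eOrder, eOrderLine
    DATA-RELATION drLine FOR eOrder, eOrderLine
        RELATION-FIELDS (OrderNum, OrderNum).
DEFINE DATA-SOURCE srcOrder FOR Order.

BUFFER eOrder:ATTACH-DATA-SOURCE(DATA-SOURCE srcOrder:HANDLE).
BUFFER eOrderLine:FILL-MODE = "NO-FILL". /* suppress the automatic child fill */

DATASET dsOrder:SET-CALLBACK-PROCEDURE("BEFORE-FILL", "preFill", THIS-PROCEDURE).
DATASET dsOrder:FILL().

PROCEDURE preFill:
    DEFINE INPUT PARAMETER DATASET FOR dsOrder.

    /* populate the line table however we like: any driving table,
       any join order, any filter */
    FOR EACH OrderLine NO-LOCK WHERE OrderLine.ItemNum = 42:
        CREATE eOrderLine.
        BUFFER-COPY OrderLine TO eOrderLine.
    END.
END PROCEDURE.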
I think the point with the query optimizer is the possibility to add indexes at deployment when the situation asks for it.
That certainly isn't the point of a cost-based query optimizer.
A traditional 4GL program needs recompilation to pick up the new situation, potentially reprogramming.
Usually, when the criteria are dynamic, and especially so that a more suitable index or indexes matching the query filter will be used, a dynamic query is used.
Obviously there will be no need to recompile if indexes or even fields are added to the table. This has been available since V9.
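For example, a sketch of such a dynamic query (the where string would normally be assembled at run time from the user's filter):

DEFINE VARIABLE hQuery AS HANDLE NO-UNDO.

CREATE QUERY hQuery.
hQuery:SET-BUFFERS(BUFFER Order:HANDLE).
/* the index is chosen when the query is prepared, so an index added
   after deployment is picked up without recompiling */
hQuery:QUERY-PREPARE("FOR EACH Order NO-LOCK WHERE Order.SalesRep = 'HXM'").
hQuery:QUERY-OPEN().
DO WHILE hQuery:GET-NEXT():
    /* process the Order buffer ... */
END.
hQuery:QUERY-CLOSE().
DELETE OBJECT hQuery.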
Let's leave the data source object, in the data access object architectural concept aside for now. Specifically about the language element.
The question still stands: what's the use of the DATA-SOURCE object in ABL if it can't do the most basic things?
I'd like to see what the use of it is; that's the argument I would be looking for.
There is nothing to stop you having your own custom query code to perform the query & fill within a DSO using the BEFORE-FILL event
I'm aware that you can. But if it's not needed, why add more stuff to make it work exactly as it would without?
Wasn't I the one who was suggesting that reporting was best done by a third party reporting tool, i.e., via SQL? Am I not also the one who has been talking about wanting to be able to have SQL data sources in the context of an ABL application in order to deal better with set-oriented operations?
I'd like to go one step further in this separation of architectural concept and the language feature and that is to step back from the implementation while we are talking about the concept. Whether one can or cannot implement using a PDS and whether or not that is desirable, efficient, elegant, or whatever is a choice we can make after we have decided on the requirements. If the problem exists, no matter what form of ABL we use to implement, then it is clearly a problem.
Ok, at this point, and since we're almost 5 pages down a thread, it's probably worthwhile starting a new thread(s) with the specific question(s) in hand. So is the question about query optimization, and what ABL does or does not do to support it; is it to do with reporting, and whether one 'object' is suitable for both reporting and transaction handling; or something else? As I must admit I'm now starting to get a little lost. (Which, given that it's Friday afternoon, is perfectly understandable )
So, are you going to start that thread?
I expect a check back through this thread might suggest more than one fresh thread ... although it appears that Alon has already started another flurry on his thread!
I'd definitely be interested in starting a thread on query optimization ... soon
I've been writing code using OERA for a few years now, and still can't quite figure out what piece runs on the DB Server, vs the AppServer, vs the Web Client. And if not in a web client environment...? I'm still fishing around for the docs since we don't seem to have any here in spite of my numerous requests, and the various links in these old threads don't seem to work anymore. Help? I think I need some visual aids here.