Use this thread to discuss anything about AutoEdge. Design decisions, implementation choices, things you liked, things you'd do differently.
Also, feel free to start a new thread if you think the topic is big enough to stand on its own. Just put "AutoEdge" in the subject so we can easily spot them.
Having gotten the code installed, I saw references like this:
.
Why the period after the include spec?
Another thing - this code appears to have tabs in it as opposed to being all space-aligned.
What tab stops were used when writing this?
I am curious about the design choice which leads to a PROPATH of
@ROOT\autoedge\src\server
@ROOT\autoedge\src\server\oeri
@ROOT\autoedge\src\gui
@ROOT\autoedge\gui\oeri
@ROOT\autoedge\gui\pure4gltv
....
and so forth: quite a few directories, a fair number of them nested within one another.
My own inclination would have been to have at most one base directory per UI plus one shared and to have all paths within the code relative to those bases. I.e., within server, you have a bunch of directories. The paths in the code to all of the programs in all of those directories except oeri are preceded by the directory name. Why make oeri different? Likewise for gui, where two of the directories are in the propath directly and the rest aren't.
> The paths in the code to all of the programs in all of those directories except oeri are preceded by the directory name. Why make oeri different? Likewise for gui where two of the directories are in the propath directly and the rest aren't.
You mean like
autoedge\src\gui\oeri
?
I am guessing that the program naming convention used in AutoEdge, i.e. beXXX, daXXX, dsXXX, etc., derives from, or is the basis for, the naming convention in the OERA white papers.
With the application now out the door, how do you feel about this naming convention?
I ask because my own inclination is to name so that things are grouped by the entity, not by the layer. Also, I would tend to create subdirectories for each entity. E.g. server\autoedge has 88 files in it, despite a relatively simple schema. For an ERP application, there would be thousands of files in this one directory. Thus, I would have been inclined to put all of the files relating to car in the autoedge\car directory. I am currently inclined to Car.cls, CarFinder.cls, and CarMapper.cls, or, in your case CarBE.p, CarDA.p, CarDS.p, but even if you use the prefix standard, there would be a limited number of files in the autoedge\car directory.
Within autoedge\src\server, there are directories autoedge, crystal, oeri, oerpcore, proxy, services, setup, sonic, test, and webservices. The propath is to the server directory and to the oeri directory within server. In the case of autoedge\src\gui, there are directories for autoedge, controls, img, oeri, pure4gltv, and services, and the propath is to the gui directory and also to the oeri and pure4gltv directories within gui. Seems inconsistent.
Not having been involved in the design myself, I'm guessing the main reason for this approach was to be able to keep the "framework" things separate from the application itself.
"framework" things including:
- oeri - OpenEdge Reference Implementation
- pure4gltv - a pure 4GL/ABL Tree View for GUI
- escript - the eScript WebObject
As a learning tool, it's very useful to keep such things separate so that the development community could take the "framework" things and modify them for their own use.
It is expected that people will modify certain programs. Such changes should actually be made in the \src directory. For people who do this, the propath could well end up being twice as long.
I don't think that such a long propath is really good practice for a deployment environment, but then there are other possibilities for deployment. One could put all the .r's in a single directory or could put them into a procedure library (that might be memory-mapped on the server side).
I think there are a number of valid choices here. I see your point about separating the business entities into their own directories. From a development perspective, you could use the Eclipse "Working Sets" to make sure you only see the files related to your single business entity.
With the current implementation some files (temp-table.i, xxSupport.i) are used by multiple business entities. One side effect of the current directory structure is that you don't have to decide which single business entity these files belong to.
I know, this isn't how you understand encapsulation and some of this might go away with a class based implementation, but I could still imagine that some objects might actually benefit from putting their temp-table definitions in an include file. (I'm relatively new to OO. Perhaps this is a debate for another thread)
One thing I wouldn't want to do is to confuse the server side components with the client side components. Keeping them separate makes it a whole lot easier at deployment time. That said, tools like RoundTable can help to make sure that the right components get deployed to the right environments for runtime.
> I'm guessing the main reason for this approach was to be able to keep the "framework" things separate from the application itself.
A fine notion. But, if the path reference in the code is to "oeri/neatThing.p" instead of simply "neatThing.p", then the additional PROPATH entry pointing to oeri is unnecessary.
If anything, I would rearrange so that we had the UI specific stuff under one base directory, the actual application code under a second base directory, and the framework stuff under a third base directory. These three base directories would be the PROPATH and all files would be in subdirectories under one of these base directories and the name of the subdirectory would be in the path reference in the code.
FWIW, I would also divide up the application code this way by entity, i.e., all of the Car related stuff in a Car subdirectory.
I am all in favor of separation. In my own application suite, there are rarely more than 10 or 20 files in a given subdirectory. This doesn't preclude functions which utilize files from multiple subdirectories. For example, in this application suite ... which is .p based ... there is a subdirectory for order processing. Within that there is a directory which is the base for the order entry function. That function uses code in several library subdirectories, e.g., op/orde/ordepo.p might reference op/olib/something.p or op/alib/something.p where op/olib is a set of general routines which apply to orders and op/alib is a set of routines which relate to allocation. Both libraries are referenced in a number of functions in multiple subdirectories.
And, as I say, I have no objection to having more than one base directory, e.g., for application versus framework, but I don't think one should have both the base directory and a subdirectory within that base directory both in the PROPATH.
Note, for example, that something in the oeri subdirectory can be found either by referring to oeri/neatThing.p or just neatThing.p ... what an interesting wrinkle that must be for doing call analysis!
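The double resolution described above can be checked with the SEARCH function, which looks a relative name up along the PROPATH. This is only a sketch: "neatThing.p" is the placeholder name from the post, not a real AutoEdge file, and it assumes a PROPATH that contains both the server directory and its oeri subdirectory.

```abl
/* Sketch only: "neatThing.p" is the placeholder name used in the post above,
   not a real AutoEdge file. Assumes the PROPATH contains both
   <base>/src/server and <base>/src/server/oeri. */
DEFINE VARIABLE cViaBase AS CHARACTER NO-UNDO.
DEFINE VARIABLE cViaOeri AS CHARACTER NO-UNDO.

ASSIGN
    cViaBase = SEARCH("oeri/neatThing.p")  /* resolved through .../src/server      */
    cViaOeri = SEARCH("neatThing.p").      /* resolved through .../src/server/oeri */

/* SEARCH returns the unknown value (?) when nothing on the PROPATH matches. */
IF cViaBase <> ? AND cViaOeri <> ? THEN
    MESSAGE "Both references resolve:" SKIP
            cViaBase SKIP
            cViaOeri
        VIEW-AS ALERT-BOX INFORMATION.
```

If both lookups succeed, any tool that analyzes RUN statements has to treat the two spellings as the same program.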
OK, as someone who was very much involved in the decisions, let me try and explain some of the thinking.
Firstly, let's state up front that one of the aims of AutoEdge is to serve as a teaching aid, and as such it doesn't always reflect what one might do in the 'real' world. The propath is one of these areas; there are others. For example, when it comes to managing error messages, we show two approaches, simply to show that there are multiple ways to achieve the same result. You'd like to think that in a real application, you'd do it one way, consistently!
Jamie is right when he says one of the reasons for the propath structure was to separate the architecture code from the application code. A second reason was that we wanted to make it clear what code came from the original 'Implementing the OERA' work and to keep that as close to the original download as possible, which also makes it easier for people to see what we'd changed in relation to that code. This explains the ../oeri portion of the propath.
This thinking was then applied to other areas. For example, if you look at the HTML-based code, you will see that we used the eScript web object to handle the merging of the HTML and ABL code. We didn't want to modify the eScript code in case there were future updates, so again this forced a certain propath structure. As Jamie rightly points out, this also applies to such things as the 4GL Treeview.
In addition, we wanted to keep the code for the various client types separate from the server, hence the propath that you see.
So in summary: is the propath a little quirky compared to what you'd do for 'real'? Probably. But in my 16+ years working with Progress, I can promise you I've also seen worse in the 'real world'.
I appreciate the pedagogic value of showing more than one way to do something. I am less sure that I would actually deploy competing methods in even a sample application. Instead, I would tend to create a white paper with code which showed both methods and then pick one as the superior method and use it in the application.
I have spent some time looking at proException.p and expect to comment on it one of these days soon. Can you point me to the core of the other method?
I still don't buy the structure of the PROPATH. If you want to leave certain pieces clearly separate, then do something like
src/framework/oeri
src/framework/pure4gltv
to clearly separate out parts of the application that come from outside and which might be upgraded. But, put the PROPATH to src/framework and reference oeri/neatThing.p in the code. If you can't do this for some reason, then at the very least make it src/oeri and src/pure4gltv so that you don't have a PROPATH which has nested directories. With nested directories you have the potential for ambiguous references ... ugly, even if it has no practical consequence.
So, I can see a PROPATH which was
src/gui
src/server
src/framework
or something to that effect, but not this nesting. I think this provides all the separation and isolation that one needs without having such a horrific PROPATH.
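As a rough sketch of that layout, the session would put only the three base directories on the PROPATH and always name the subdirectory in the code. The directory names come from the post; "neatThing.p" is again a placeholder, and prepending to the existing PROPATH is just one assumed way of setting it.

```abl
/* Sketch only: the directory names follow the post above; "neatThing.p"
   is a placeholder, not a real AutoEdge program. */
PROPATH = "src/gui,src/server,src/framework," + PROPATH.

/* Framework code is then always referenced through its subdirectory,
   so each program has exactly one valid relative name. */
RUN oeri/neatThing.p.
```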
> I can promise you I've also seen worse in the 'real world'
I'm sure you have ... but I think that one should recognize that sample code coming from PSC gets looked at as a guide to how to do things ... even when it is small isolated code fragments in a manual. How much more will AutoEdge be looked at this way because it is such a rich example ... clearly lots of thought has gone into it, so it must be a good example, mustn't it?
> I have spent some time looking at proException.p and expect to comment on it one of these days soon. Can you point me to the core of the other method?
It's actually mentioned in the 'Design Decisions' section at the end of the Business Entity Paper (http://www.psdn.com/library/entry.jspa?externalID=1444&categoryID=302).
Thanks ... I'll take a look.
Well, I guess I would say, positive points for including it in the discussion, but negative points for including it in the code. I think that the sentence from that doc which says "The standard and preferred way is to use the addError() routine" pretty well says it all...
> Well, I guess I would say, positive points for including it in the discussion, but negative points for including it in the code. I think that the sentence from that doc which says "The standard and preferred way is to use the addError() routine" pretty well says it all...
Well, as I said earlier, the aim is to show multiple ways, but then if we feel one just makes more sense, to point that out. Not everyone may have implemented a form of formal error handling in their application, and so the alternative approach may be a valid choice in that situation. The main thing is that it's documented.
Myself, I would focus on showing them how they could bolt on formal error handling to an application that doesn't have it now and gradually utilize it as opportunity presents.
Well, speaking of the error handling, let's talk about proException.p a little.
And, let me be frank up front about my aversion to the preprocessor, even though there are places where I have been forced to use it myself as the lesser of evils.
One piece I am curious about is why you have rather slavishly copied the Java exception handling? To be sure, it is the obvious model, but I would have thought that the history of ABL error handling was focused on the block and so it might have incented you to think in terms of block structured exception handling.
Something makes me nervous about raising STOP all the time ...
Any reason you did not provide temp-tables so that one could stack multiple errors?
It seems like there are an awful lot of entry points ...
Let me guess that this one preceded 10.1A?
PROCEDURE getUUID :
    /*
      Purpose:    Generates a Universally Unique Identifier.
      Parameters: OUTPUT pcUUID
      Notes:
    */
    DEFINE OUTPUT PARAMETER pcUUID AS CHARACTER NO-UNDO.

    pcUUID = GUID( GENERATE-UUID ).
END PROCEDURE .
The naming convention used in AutoEdge for program units, variables, and such seems to derive at least in part from the one used in John Sadd's whitepapers. Two questions:
1. Has this standard been codified in a document so that one can look at it rather than having to deduce it?
2. From an earlier exchange I got some indication that perhaps, now that AutoEdge was done, there might be some second thoughts about the standard, e.g., beXXX causes all be units to sort together rather than the various units associated with one entity. Is this more than a niggling doubt in the minds of isolated developers? Does anyone have a new set of proposals?
After some distraction, I am now back ready to do a bit of code reading. Is there an interest and willingness on the part of PSC to engage in a bit of "code review"? I.e., to discuss why certain design choices were made, what alternatives might have been considered and discarded, whether a particular alternative was considered, what advantages and disadvantages one might see, etc.
I think this would be illuminating to a lot of us.
> 1. Has this standard been codified in a document so that one can look at it rather than having to deduce it?
There is a programming standards document available within the AutoEdge documentation in the Documentation > Project directory.
> Let me guess that this one preceded 10.1A?
> PROCEDURE getUUID :
>     /*
>       Purpose:    Generates a Universally Unique Identifier.
>       Parameters: OUTPUT pcUUID
>       Notes:
>     */
>     DEFINE OUTPUT PARAMETER pcUUID AS CHARACTER NO-UNDO.
>     pcUUID = GUID( GENERATE-UUID ).
> END PROCEDURE .
No, this is not pre-10.1A. It uses the new 10.1A language features to generate a global UUID.
Got it. Thanks.
What about the "If we had it to do over again" aspect?
My point is that with 10.1A this becomes trivial and a language built-in so I was guessing that the procedure preceded 10.1A and originally contained some workaround code to imitate what the built-in now does. Then, when you got the built-in, it replaced the body of the procedure.
Nothing significant. This is exactly the way I like to code when I expect something to be coming down the line so that minimal changes are necessary to switch over. I was just amused by how trivial the procedure became.
Personally, I would always wrap a GUID() in a function or procedure call - that way if there's anything else that needs to be done at the same time the ID is obtained, it can be.
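A function form of such a wrapper might look like the sketch below. The name newUUID is hypothetical; AutoEdge's own wrapper is the getUUID procedure quoted earlier in the thread, and the built-ins used here need 10.1A or later.

```abl
/* Sketch only: newUUID is a hypothetical name; AutoEdge's own wrapper is the
   getUUID procedure quoted earlier in the thread. Requires 10.1A or later. */
FUNCTION newUUID RETURNS CHARACTER ():
    DEFINE VARIABLE cUUID AS CHARACTER NO-UNDO.

    cUUID = GUID(GENERATE-UUID).

    /* Anything else that must happen whenever an ID is issued
       (logging, auditing, ...) would go here. */
    RETURN cUUID.
END FUNCTION.
```

Callers then write newUUID() wherever they need an ID, and never touch the built-in directly.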
OK, I've been through the standards doc. My compliments that one actually exists and even seems to have been followed. I was originally thinking to have some give and take here, although I was first hoping to get the AutoEdge folks to make some comments on what they might decide to do differently, but it seems to me that the initial response might be a bit lengthy. What about the idea of converting it to a PDF and then I could add annotations and post both versions on OE Hive? I think that would be interesting.
There is a lot of it that is minor questions of taste, but a few points which I think are more substantive and "chewy", i.e., actually worth discussion.
> Got it. Thanks.
> What about the "If we had it to do over again" aspect?
We've tried to address some of the 'If we had to..' in the 'Lessons Learned' sections in the supporting sample documentation which can be found in each of the OERA sub-categories under the AutoEdge category (http://www.psdn.com/library/kbcategory.jspa?categoryID=298).
This is not in the download materials?
I'm not readily finding it at that link?
While I am interested in the question generally, this specific one was about the standards.
> My point is that with 10.1A this becomes trivial and a language built-in so I was guessing that the procedure preceded 10.1A and originally contained some workaround code to imitate what the built-in now does. Then, when you got the built-in, it replaced the body of the procedure.
That is exactly what happened. In the first version (10.0), this procedure generated a long string out of things like date and time, session-handle, IP address and lots of random characters to create a "unique" ID.
In a second iteration, we had this procedure call an external DLL to get a UUID.
When the UUIDs became available in the language, we replaced all that stuff with this very simple call. And while it now looks strange to create a wrapper around a single line of code, it made a lot of sense in terms of maintainability back then, and probably is still a good idea now just in case you would ever want to replace this function. Not that I can think of anything more unique than a UUID :P
That's what I figured. It is certainly a good design approach for minimizing disruption as new language features are added. My favorite antique example of this in my own code was writing a series of routines that presented various kinds of dialog boxes ... in version 5 or something.
We have expanded the AutoEdge aspect on OpenEdge Hive a little bit. The main page is here http://www.oehive.org/AutoEdge . In addition to the previous issue tracker, we have added a section on project documentation to provide commentary. You will find the first of these, an annotated version of the Programming and Naming Conventions document here http://www.oehive.org/node/656 .
Looking over some of the sample code for xxCar.*, I run into a couple of questions about design choices (other than the lack of .cls, of course! ).
In becar.p:
There are two methods, eCarCreatePreTrans and eCarModifyPreTrans. I presume these are separate so that there is a consistent signature pattern for all beXXX.p and in case there needs to be some difference between them, but in this instance the two methods are identical. Would it not have been preferable for one to call the other or both to call some common code?
In those same procedures, there is a call to eCarValidation, which is a procedure in becar.p. Why wouldn't this be a method of dacar.p?
In those same procedures, there is a call to getCarDesc, which is in the dacar.p superprocedure. Why wouldn't this already be resolved, i.e., de-normalized by the fetch in dacar.p?
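The "both call some common code" option from the first question could be sketched as follows. The real signatures in becar.p are not shown in this thread, so parameters are left out, and carPreTransCommon is a hypothetical name.

```abl
/* Sketch only: the real signatures in becar.p are not shown in this thread,
   so parameters are omitted, and carPreTransCommon is a hypothetical name. */
PROCEDURE eCarCreatePreTrans:
    RUN carPreTransCommon.
END PROCEDURE.

PROCEDURE eCarModifyPreTrans:
    RUN carPreTransCommon.
END PROCEDURE.

PROCEDURE carPreTransCommon PRIVATE:
    /* The logic currently duplicated in both hooks would live here. */
END PROCEDURE.
```

This keeps the two entry points that the beXXX.p pattern expects while leaving one place to change if create and modify ever need to diverge.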
more later ...
I agree with this approach. I think it generally is a good idea to try to minimize 'change effects' by abstracting these types of utility statements and wrapping them in self-made code. So even if we had started with OpenEdge 10.1A or later, the GUID() call would be wrapped.
BUG:
in src/server/autoedge/becustomer.p:
ASSIGN
    dLicDate = DATE( MONTH(eCustomer.CustomerBirthDate),
                     DAY(eCustomer.CustomerBirthDate),
                     YEAR(eCustomer.CustomerBirthDate) + dLicYear)
    .
will not work for people born on Feb 29.
I've been in other shops that made this mistake, and in one of them it cost ~$20K to fix, plus the reputation hit in the marketplace.
> BUG:
> in src/server/autoedge/becustomer.p:
> ASSIGN
>     dLicDate = DATE( MONTH(eCustomer.CustomerBirthDate),
>                      DAY(eCustomer.CustomerBirthDate),
>                      YEAR(eCustomer.CustomerBirthDate) + dLicYear)
>     .
> will not work for people born on Feb 29.
> I've been in other shops that made this mistake, and in one of them it cost ~$20K to fix, plus the reputation hit in the marketplace.
Of course we left such features in the code to see who would be really paying attention.
But good catch. Feel free to post a fixed version.
> But good catch. Feel free to post a fixed version
I'm doing a code review. Fixing the problem is someone else's responsibility.
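For anyone who does want to pick the fix up, one possible shape is sketched below. The helper names addYears and isLeapYear are hypothetical and not part of AutoEdge, and the choice to map Feb 29 to Feb 28 in non-leap years is an assumption; Mar 1 would be an equally defensible rule.

```abl
/* Sketch only: addYears and isLeapYear are hypothetical helpers, not part of
   AutoEdge. Feb 29 is mapped to Feb 28 in non-leap target years; mapping to
   Mar 1 instead is an equally valid business decision. */
FUNCTION isLeapYear RETURNS LOGICAL (INPUT piYear AS INTEGER):
    RETURN ((piYear MODULO 4) = 0 AND (piYear MODULO 100) <> 0)
        OR (piYear MODULO 400) = 0.
END FUNCTION.

FUNCTION addYears RETURNS DATE (INPUT pdBase AS DATE, INPUT piYears AS INTEGER):
    DEFINE VARIABLE iYear AS INTEGER NO-UNDO.
    DEFINE VARIABLE iDay  AS INTEGER NO-UNDO.

    ASSIGN
        iYear = YEAR(pdBase) + piYears
        iDay  = DAY(pdBase).

    /* DATE(2, 29, <non-leap year>) raises an error, so clamp the day first. */
    IF MONTH(pdBase) = 2 AND iDay = 29 AND NOT isLeapYear(iYear) THEN
        iDay = 28.

    RETURN DATE(MONTH(pdBase), iDay, iYear).
END FUNCTION.
```

The assignment in becustomer.p would then read dLicDate = addYears(eCustomer.CustomerBirthDate, dLicYear), assuming dLicYear holds a whole number of years.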
A couple of things I have noticed in terms of the standards document versus the dictionary which I am wondering if someone would like to comment on.
1. The standards document indicates that the primary key of every table should be of type integer, but the actual primary keys in the autoedgexx databases are of type character with a format of X(64). Without even looking at the code I can guess that this means the use of a GUID instead of an integer, which seems fine to me, but perhaps is an indication that the standards document should be revised.
2. The standards document indicates that fields which are foreign keys to another table should have the same name as the key of that table, i.e., so an OF join would work. However, the use of the BaseCode table to serve as a generic holder of codes and descriptions implies that one could not have two such codes in the same table and still keep this naming standard. Consequently, we have Car with BaseCodeStyleID and BaseCodeColorID, among others. This seems perfectly clear, but again it seems that one should either annotate the standards document to indicate this exception or consider some other change.
Just curious.
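The naming exception in point 2 shows up in queries roughly as in the sketch below, which needs explicit WHERE clauses and named buffers. The field name BaseCodeID for BaseCode's key is a guess, since the thread does not give it; Car.BaseCodeStyleID and Car.BaseCodeColorID are the names mentioned in the post.

```abl
/* Sketch only: assumes BaseCode's key field is named BaseCodeID, which is a
   guess; Car.BaseCodeStyleID and Car.BaseCodeColorID are named in the post. */
DEFINE BUFFER bStyle FOR BaseCode.
DEFINE BUFFER bColor FOR BaseCode.

FOR EACH Car NO-LOCK,
    FIRST bStyle NO-LOCK WHERE bStyle.BaseCodeID = Car.BaseCodeStyleID,
    FIRST bColor NO-LOCK WHERE bColor.BaseCodeID = Car.BaseCodeColorID:
    /* "FIRST BaseCode OF Car" cannot express either join, because the field
       names differ and two roles point at the same table. */
END.
```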
Just a thought, but I think if I were going to revise this application, one of the things I would consider doing is creating a table for the pool of test drive cars which was separate from the pool of the dealer's inventory. Realistically, I believe that a dealer would have only a limited number of cars out of the inventory designated for giving test drives. So, if someone was interested in a blue XYZ, they might get their test drive in a green one, but be able to walk out on the lot and see the blue one. Functions like ASN for incoming cars are definitely related to full inventory, not just cars associated with test drives. Also, there are probably properties of test drive cars that one might want to track, like total miles driven. I would imagine that dealers had rules about how many miles they want to put on a car before it is no longer saleable as new. So, they might put a car into the test drive pool until that limit is approached and then remove it from the pool and/or flag a car with excessive miles that it can no longer be sold as new.
> A couple of things I have noticed in terms of the standards document versus the dictionary which I am wondering if someone would like to comment on.
> 1. The standards document indicates that the primary key of every table should be of type integer, but the actual primary keys in the autoedgexx databases are of type character with a format of X(64). Without even looking at the code I can guess that this means the use of a GUID instead of an integer, which seems fine to me, but perhaps is an indication that the standards document should be revised.
Your guess is correct. We moved to using GUIDs during the project, so that is probably an oversight in terms of updating the doc.
> ... Also, there are probably properties of test drive cars that one might want to track, like total miles driven. I would imagine that dealers had rules about how many miles they want to put on a car before it is no longer saleable as new. So, they might put a car into the test drive pool until that limit is approached and then remove it from the pool and/or flag a car with excessive miles that it can no longer be sold as new.
Sure, but don't forget this is only an example app; we're not looking to sell it. However, we do have a little flag on the inventory that can be set/un-set to mark a car as available for a test drive or not. Simple, I know.
Mike, I appreciate that it is only an example, but one of the things which bugs me persistently about most example code is when it is too simple. AutoEdge on the whole certainly isn't that, whereas sports was. But I see no reason not to craft an example with the kind of structures that one would need if one were to flesh it out into a full application. Aren't you suggesting that people make additions? Well, suppose someone wanted to create a demo inventory module? They would be faced with having to change the dictionary to get a suitable structure, not just additional fields, but changing the structure, thus requiring other parts of the app to be modified in order to accommodate the change. I think it is possible to create data structures which anticipate growth and that this itself is a good lesson in best practice.
E.g., there is one contact in the dealer. Now, it is reasonable to designate a principal contact in the dealer, but there is certainly more than one person in the dealership that one might wish to contact at various times. One should have a structure which allows associating multiple employees with the dealership and flagging one of these as principal.
> E.g., there is one contact in the dealer. Now, it is reasonable to designate a principal contact in the dealer, but there is certainly more than one person in the dealership that one might wish to contact at various times. One should have a structure which allows associating multiple employees with the dealership and flagging one of these as principal.
We may not have a 'Principal' flag, but there are address details for the dealership, and there are multiple employees per dealer.
The question with all these details is how far you go before you stop being an example for discussion and knowledge purposes and start building a fully featured app. It's a balance, and I hope in the main we've got the balance about right.
Well, to my taste I would have gone a bit further. Like the inventory thing ... I would have made it two tables. The two tables would not have to have all of the fields in them that a full-fledged application would have, but the basic structure would be there so that, should someone want to do an enhancement, one could merely add a field or two and move forward without having to change anything that is already there. I think most of this is up-front work on the schema. One could have significantly more tables in the schema without making the application significantly more difficult to create, but the resulting application would more closely resemble the structure of a real application.
> Well, to my taste I would have gone a bit further.
Come on Tom, first we gripe that there's no showcase app for PSC technologies, and then they come out with one - and they didn't go far enough?
That's what happens in real life. Decisions are made, the program is developed to solve a particular set of problems, and then put out to the field.
The next thing you know, the users want upgrades.
I haven't looked at AutoEdge as a "real" app, but as one that demonstrates their various technologies in a realistic setting.
If it was a "real" app, it should have the same kind of artifacts real apps have - like spaghetti code, code done in accordance with limitations of past versions, and stuff like that.
If they're going to add the stuff you're looking for in order to become a "real" app, I vote they write some spaghetti in as well.
Yeah, I know ... but one of my roles in life seems to be goading people to higher levels of perfection!
I know it was a joke, but I wouldn't advocate adding in spaghetti or anything bad that one runs into in real apps, but rather just some reflection of the complexity of real apps.
sports has been an embarrassment for years because the data structure was so minimal that one couldn't even write one's own demonstration or example software against it without the code looking stupid. And yet, there is a huge attraction to using it as a base because one can then just hand out code and one knows that everyone who has a Progress installation can try out the code without having to go through building and populating a database.
One of my hopes for AutoEdge is that we would get past this. Possibly AutoEdge could be bundled with future releases? Has that been considered? Or, at least, if we get to the point where someone can easily download and install AutoEdge, then one could expect to send out some sample code that ran against it. The fact that there is a bunch of interesting stuff going on makes it that much more attractive since one's example can assume that context ... although it can get one into the complexities of getting Sonic working and the like. Hopefully, though, that would only be required if the sample code one wanted to distribute actually related to the parts using Sonic.
To me, the preferred way of designing a demo application is to pretend that it is a first iteration of a real system. To be sure, in a real system as it builds the users will periodically come up with requirements which violate what seemed previously to be perfectly valid assumptions about the requirements of the data structure. But, I think that an experienced enterprise architect can start off from the beginning incorporating data structures which anticipate the kinds of requirements which may not be included in the first round, but which are certain to arise as things go on. E.g., the need to associate a set of contacts with a dealership, one of which is primary.
If one puts in some of this kind of thinking up front and creates the schema accordingly, I don't think the coding task for the simple initial version is greatly increased. But, if one decides to make a particular feature more "interesting", all that needs to change is the parts of the code which relate to the change because the schema is pre-adapted, as it were, to accommodate more complex behavior.
I should probably also tell a little story. Many years ago, back when god was a young woman and I was a graduate student, my dissertation advisor was a fellow by the name of W.W. Howells. He just died a little less than a year ago. One of the "interesting" things about taking a course with him was the way he marked up papers. If one wrote a paper that he really liked, it would come back absolutely covered in red ink. Once one got over the shock and looked at what he wrote one would realize that it was a combination of "Good observation!", "This makes me think of XXX which you might want to look into", "Read what YYY says about this in ZZZ", and suggestions about how to improve wording and such as if he was trying to get it ready for publication and wanted it to come around to the best possible form. This was very consistent with his style of treating grad students as junior professionals, people from whom one expected professional quality work, but which needed some help and guidance. If he didn't like the paper, on the other hand, it would come back with almost nothing written on it except something like "You can do better".
I have tended to be the same way.
Point being, AutoEdge is good enough to be worth criticizing.
Interesting that all of the messages in MessageData which are linked to the code for German are simply English phrases preceded by "DE - ". Did we forget to actually do the translations?
> Interesting that all of the messages in MessageData which are linked to the code for German are simply English phrases preceded by "DE - ". Did we forget to actually do the translations?
You mean that's not real German? I'm sure you know that we English don't tend to do other languages very well, so when talking to non-English speakers we tend to talk either slower or louder till we think we're understood! The best part, as some of you may know, is that Christian is in fact German!!
Well, then, I'm surprised all those messages didn't end with three exclamation marks!
If you want to see real German messages, run setup data and select the German XML file. And then I'm sure you'll see the problem we had...
The reason we have these strange messages is that in this case we were trying to show that the message translation works, not to demo it to a German customer. And surprisingly most of our developers and testers weren't fluent in German, so they didn't know if "Änderungen gespeichert" really was the correct message for saving a record. The "DE" prefix solved this problem - it's obvious that we are pulling in a different string from a different language, but the information is still there to verify that the correct string is used.
Makes sense?
Makes sense ... still a little funny, though!
After going through OERA documentation and Autoedge code, I have the following comments/questions:
- When calling the fetchWhere operation, it seems that the querystring must be specified using the physical table/field names. Why is the service requester supposed to know the physical data source structure? Since it is the Business Entity that is known to the requester, I think the requester should specify filters on fields in the BE dataset. Then the BE/DAO would translate the logical filters into a physical filter that can be used as a query for the data-source. Maybe this logical/physical translation could be done using the buffer's ATTACHED-PAIRLIST attribute.
- Why does the Car Data Access Object (daCar.p) read physical db tables (see getCarBrandId, getCarModelId, validateBrand, validateModel, getCarDesc)? It is written that the DAO has no access to the data source; it's the Data Source Object that has the knowledge of the physical data source structure.
- In the Data Source Object files (sce*.p), I noticed that there are some references to the BE temp-table when calling the get-buffer-handle method. Isn't this violating some rules? Shouldn't the Data Source Object know only the physical data source? And since the DSO is specifying the BE temp-table name, it means that this DSO cannot be reused by a different DAO.
- Is OERA enforcing the following rule: every reference to a particular db table should be centralized? I.e., fetch & update (add, modify, delete, validation) operations should be done in a single place only.
In AutoEdge, I see the Car DSO read the Dealer db table... Shouldn't it call the Dealer BE to get this info?
But then, if every read goes through the corresponding BE, this will become a performance killer. Any thoughts on this?
Thanks,
Guillaume Morin
I think you have some good questions here. I am also a bit unhappy about the data access layer in AutoEdge ... although I have to say that it is a brave and interesting attempt!
I find the data access / data source split in AutoEdge to be both a little less intuitive than I would wish and to periodically have the kind of apparent violations you have found. Another one that bothers me is having the BE object call a method on the data access object to populate a description field which I would have had populated as a part of marshalling the "object".
I don't know that I agree that any one table should only be referenced in one place though. To me, the data access layer serves to do the object-relational mapping. So, if item description needs to be a part of order line and also a part of item, that doesn't mean that I need to get that description from an item object, I can get it straight from the database, at least since I am only reading it. Similarly, the descriptions which go with the codes in an order or order line. The ShipVia code, for example, might be referenced in a whole bunch of places with all of its properties read only except where ShipVia codes are actually being maintained.
As to the layering in the data access layer, my own inclination is to use a Finder and a Mapper in Fowler's terms. The Mapper is the only one with reference to the database. The Finder needs to establish the query, but it doesn't really need to do so in terms of the database tables. It can do so in terms of the object references. Thus, for a simple example, the database might have a field called CustNum which was a unique identifier while the object might call this CustID because we have decided that all unique identifiers should be XxxxID. The Finder would have a method for setting up the Mapper to do a unique find of an individual record, but in Finder this would be called CustID. Only in the FindUnique or SetUniqueID method of the Mapper would this get equated with CustNum.
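The Finder/Mapper split described above could look roughly like the following. This is only a minimal Python sketch, not the poster's unpublished code and not ABL; the class names, the `set_unique_id` method, and the in-memory `CUSTOMER_TABLE` standing in for a database are all assumptions made for illustration. The point is that the physical name CustNum appears only inside the Mapper.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Domain object: uses the logical name cust_id, never CustNum."""
    cust_id: int
    name: str


# Hypothetical stand-in for a database table; rows use the physical name CustNum.
CUSTOMER_TABLE = [
    {"CustNum": 1, "Name": "First Customer"},
    {"CustNum": 2, "Name": "Second Customer"},
]


class CustomerMapper:
    """The only class that knows the physical field names."""

    def __init__(self, rows):
        self._rows = rows
        self._unique_id = None

    def set_unique_id(self, cust_id):
        # This is where the logical CustID gets equated with the physical CustNum.
        self._unique_id = cust_id

    def fetch(self):
        for row in self._rows:
            if row["CustNum"] == self._unique_id:
                return Customer(cust_id=row["CustNum"], name=row["Name"])
        return None


class CustomerFinder:
    """Sets up queries purely in terms of the object's own names."""

    def __init__(self, mapper):
        self._mapper = mapper

    def find_by_cust_id(self, cust_id):
        self._mapper.set_unique_id(cust_id)
        return self._mapper.fetch()


finder = CustomerFinder(CustomerMapper(CUSTOMER_TABLE))
found = finder.find_by_cust_id(2)
```

Renaming the database field would then touch only `CustomerMapper`; callers of the Finder keep asking for a CustID.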
As for the data source manipulating the BE temp-table, I haven't looked at the specific code recently, but let me say this. The relationship would probably be more clear here if we were actually dealing with OOABL instead of this .p imitation. The domain object or business entity object is really an object which is created in the data access layer, travels to the business logic layer, and is then returned to the data access layer for retention. So, the data access layer does need to know about the domain object. I don't know whether the specific case you cite is one that I would approve of, but one certainly can't say that the data access layer should have no knowledge of the business entity object. With only temp-tables being passed, this is perhaps a bit less clear than it would if there were full encapsulation in objects.
Keep the discussion going! We need more of this!
- When calling fetchWhere operation, it seems that
the querystring must be specified using the physical
table/field name. Why is the service requester
supposed to know the physical data source structure ?
Since that is the Business Entity who is known to the
requester, I think the requester should specify
filters on fields in the BE dataset. Then the BE/DAO
would translate the logical filters to a physical
filters that can be used as a query for the
data-source. Maybe this logical/physical translation
could be done using the buffer's attribute
ATTACHED-PAIRLIST.
This however is a consistent philosophy, since the same "hack" has been used in ADM, ADM2, Dynamics and whitepapers. It seems like nobody cares about this design flaw and you will soon be called a "purist" when you do.
Security is another issue, since you can insert anything.
- When calling fetchWhere operation, it seems that
the querystring must be specified using the physical
table/field name. Why is the service requester
supposed to know the physical data source structure ?
Another "hidden design flaw" is the fact that the actual physical storage type is exposed everywhere. So when you store an amount value as a decimal it doesn't mean that all the business logic should deal with type "decimal" as well. When you're going to spend a lot of time redesigning your application, you probably want to wrap the raw 4GL types as well. But that's of course not possible when you're using temp-tables internally.
Indeed, please do keep this conversation going as it does raise some interesting questions.
The fact first of all that there is something like AutoEdge available, to even have these discussions about, I think proves the point of producing it (So that hopefully justifies some of our efforts last year!).
As for some of the specific points raised, it's a valid question about how much a client (service requester) should know about the physical tables etc in the service provider and one we should explore further.
The question about the DAO/DSO relationship is also a valid one, and I think it's fair to say this is an area in which our thoughts have evolved over the project and in the past few months. Putting aside for a second the physical implementation (i.e. whether you have a single source code file for both, or two separate ones), there is a definite logical separation. We see the DSO as being the component that has the physical knowledge of the data source, and as such it is here that all the data logic for the physical data source should reside. We see the DAO as then being the aggregator of one or more DSOs to build the logical data structure. The DAO will also be responsible for any logic that operates at a complete structure level, as opposed to the individual elements (temp-tables for example), such as ProDataSet-level call-backs. (See the OERA Quick Ref Guide http://www.psdn.com/library/entry.jspa?externalID=2123&categoryID=54 which has a new component model to describe this.) This separation means the DAO need not care about the physical data source. You could have, for example, a 'getCar' method in the DAO which simply turns around and calls 'getCar' in a DSO, and it is the DSO that has the physical code that knows how to get the car: in one instance that may be a simple find statement, and in a DSO that deals with a non-DB data source, some form of XML processing to get the car from an XML file. Having said all that, yes, there may be some minor indiscretions in the code today!
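To make the 'getCar' delegation concrete, here is a rough Python sketch rather than AutoEdge code. The class names, the dictionary standing in for a database table, and the XML layout are all assumptions chosen for illustration; only the idea that the DAO calls `get_car` on whichever DSO it was given comes from the description above.

```python
import xml.etree.ElementTree as ET


class DbCarSource:
    """Data source object backed by a stand-in for a database table."""

    def __init__(self, rows):
        self._rows = rows  # maps car id -> model name

    def get_car(self, car_id):
        model = self._rows.get(car_id)
        return None if model is None else {"id": car_id, "model": model}


class XmlCarSource:
    """Data source object that reads the same information from XML text."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)

    def get_car(self, car_id):
        for elem in self._root.findall("car"):
            if elem.get("id") == str(car_id):
                return {"id": car_id, "model": elem.findtext("model")}
        return None


class CarDataAccess:
    """Data access object: delegates and has no idea how the car is stored."""

    def __init__(self, source):
        self._source = source

    def get_car(self, car_id):
        return self._source.get_car(car_id)


db_dao = CarDataAccess(DbCarSource({7: "Roadster"}))
xml_dao = CarDataAccess(
    XmlCarSource('<cars><car id="7"><model>Roadster</model></car></cars>')
)
```

Both DAOs hand back the same logical structure for car 7, which is the property the separation is meant to buy.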
Also, in terms of how the components interact with each other, such as whether a BE should call another BE directly, we've put what we consider to be sensible guidelines down in the quick ref guide mentioned above. So if you need to, for example, use more than one BE to service a request, then this should be made a task, and it has the job of then doing the orchestration between the BEs. This again should allow for greater flexibility and re-use, as you are not building any dependencies at the BE level, and it removes the possibility of an accidental call architecture within the Business Components layer!
One of my proposed Exchange 07 sessions is an update on all this, and how & why we've arrived at some of the decisions we've made with respect to the OERA and the interactions, relationships and characteristics of its components. So if it gets accepted I'll expect you all to be there, and we can have a good chat.
- When calling fetchWhere operation, it seems that
the querystring must be specified using the physical
table/field name. Why is the service requester
supposed to know the physical data source structure ?
Since that is the Business Entity who is known to the
requester, I think the requester should specify
filters on fields in the BE dataset. Then the BE/DAO
would translate the logical filters to a physical
filters that can be used as a query for the
data-source. Maybe this logical/physical translation
could be done using the buffer's attribute
ATTACHED-PAIRLIST.
I am currently involved in a project where exactly this question has been raised. As long as the field mappings between the logical (entity temp-tables) and physical (db) are simple, there is no problem with doing this.
Eg. Business Entity passes "WHERE OrderNum = 5" to Data Access Object. As the database table & field names are in German, this gets "translated" into "WHERE BestellNum = 5".
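For simple one-to-one mappings, that translation can be little more than a token-by-token rename. The Python sketch below is an illustration only, not the project's code: `FIELD_MAP`, `to_physical`, and the KundenName field are hypothetical, and the tokenizer assumes plain identifiers and quoted literals, which is much simpler than real ABL query syntax.

```python
import re

# Hypothetical logical-to-physical field map for an Order entity.
FIELD_MAP = {"OrderNum": "BestellNum", "CustName": "KundenName"}

# Matches quoted literals (left untouched) or bare identifiers (candidates to rename).
_TOKEN = re.compile(r"'[^']*'|\"[^\"]*\"|[A-Za-z_][A-Za-z0-9_]*")


def to_physical(where, field_map):
    """Rename logical field names in a where clause to their physical names."""
    # Field names are compared case-insensitively.
    lookup = {logical.lower(): physical for logical, physical in field_map.items()}

    def swap(match):
        token = match.group(0)
        if token[0] in "'\"":
            return token  # string literal: never rewritten
        return lookup.get(token.lower(), token)

    return _TOKEN.sub(swap, where)


physical = to_physical("WHERE OrderNum = 5", FIELD_MAP)
```

Skipping quoted literals matters: a customer whose name happens to be 'OrderNum' must not be rewritten along with the field.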
I think things would start to get interesting if the fields you're trying to translate are populated through an AFTER-xy-FILL event. Not only could they get difficult very quickly, they might also have a huge effect on performance if you have to do something like: fetch each order where OrderTotalValue > 1000000. If OrderTotalValue is calculated in a buffer AFTER-FILL event and you want to filter the record by deleting it again you could, but...
If we discover any nasties during the project, I'll be sure to pass the info on.
- Why does the Car Data Access Object (daCar.p) read
physical db table (see getCarBrandId, getCarModelId,
validateBrand, validateModel, getCarDesc) ? It is
written that DAO has not access to data source, it's
the Data Source Object who has the knowledge of the
physical data source structure.
The design question that I have to ask is: where is the advantage in separating out the code in the data-source program (sceXY.p) files from the data-access program (daXY.p)?
For me, the data-access program is where the OERA's Data Access Layer is implemented. By definition, that means that it's doing the mapping between the physical (db) and the logical (in this case ProDataSet) models. As such, I think it's quite okay for daXY.p to access the data-source (in this case the database) directly, though only for the database tables that physically represent the business entity in question.
I question whether any single business entity will have two different kinds of data sources in a real world application ie. partly db-table, partly XML, partly WebServer, etc. I could imagine that in such a case, a different design of the business entity might be in order (ie. break it up into more appropriate entities)
If you have an application where multiple kinds of data sources are required in the same data access object, I'd suggest that's an even bigger reason to make sure the data access object knows how to get to those things directly as it could make a big difference for performance. Indexed database access <> XML read followed by filtering <> WebService request (possibly requiring filtering) <> X-Query result set <> ...
I also question why the business entity makes calls to "RUN addDataSource IN hAccessProc", passing in hard-coded values as to which sce.p program is to be called. Even if there is a valid reason for having separate sce.p files, I think the business entity should not know anything about a data source. This is the job of the data access object.
I'm interested in knowing what others think...
- Is OERA enforcing the following rule: every
reference to a particular db table should be
centralized ? ie: Fetch &
update(add,modify,delete,validation) operations
should be done in a single place only.
In Autoedge, I see the Car DSO read the Dealer db
table... Shouldn't it calls the Dealer BE the get
this info ??
But then, if every read go through the corresponding
BE, this will become a performance killer. Any
thoughts on this ?
OERA is of course a reference architecture. AutoEdge a reference implementation. Design decisions were made based on the business case described (see the AutoEdge documentation).
Your observation that the purist approach to accessing the content of other business entities has not always been followed in AutoEdge is correct. In a simple example application this might not be necessary. Still, given that this is a simple example to show how things could be done, I do see the value in implementing a purist approach.
Of course, then you have the possibility to have one business entity call another business entity to do things like validation, etc. The question is how far do you want to take it.
In my example application, perhaps I want my different business entities running on different AppServers on separate physical machines (and naturally with separate databases). In such a situation, my business entity must call another AppServer. Of course, it would use the standard Services Interface gateway (and associated proxy) and do all the right things like pass context for the request, authenticate that I'm allowed to execute this request, etc, etc.
In another example application, perhaps my other business entity has an XML data-source, retrieved from Oracle through Sonic ESB, complete with XSL transformations and the works...
If we could wish for an example of doing things the "right" way, what should we wish for? Who knows, perhaps Christian & Mike might even build it for us.
This however is a consistent philosophy, since the
same "hack" has been used in ADM, ADM2, Dynamics and
whitepapers. It seems like nobody cares about this
design flaw and you will soon be called a "purist"
when you do
Okay, so I'm a purist
Security is another issue, since you can insert
anything.
I think it's fair to say that implementing security in the business entities was not one of the original goals of AutoEdge.
In a project I'm currently working on, we're tackling exactly this issue. It's pretty easy to have security based around which business entities may or may not be accessed by a given user, as well as which methods (ie can-read = fetchWhere, can-write = saveChanges).
Then you can think about can-read/can-write on a field level. Okay, that's not too hard - just change the where clauses for the FILL events.
You can also think about can-read/can-write on a data-level (can-read companies 5,6,8 / can-write company 6). Okay, that's not so hard either. Again, just change the where clauses for the FILL events.
However, those two together with mapping logical to physical fields and you've soon got some pretty complex query re-mapping going on. Once you've got it all working okay, you just need to pass the FILL query through your magic ABL Query Optimiser and bingo, you not only have a great solution, it's fast.
The question we're currently discussing is where is the right place to do such security filtering. Should this be done in the business entity, before the query context is passed off to the data access object, or is the data access object the right place for this?
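As an illustration of the data-level case (can-read companies 5, 6, 8), the restriction can be wrapped around whatever filter the requester supplied. This is a hedged Python sketch, not code from the project: `restrict_to_companies`, the `CompanyNum` field name, and the generic where-clause syntax (including `WHERE FALSE` for "nothing allowed") are assumptions.

```python
def restrict_to_companies(where, allowed_companies, company_field="CompanyNum"):
    """Combine a requested filter with the companies this user may read."""
    if not allowed_companies:
        # No companies allowed: return a filter that matches nothing.
        return "WHERE FALSE"

    allowed = " OR ".join(
        f"{company_field} = {int(company)}" for company in sorted(allowed_companies)
    )

    body = where.strip()
    if body.upper().startswith("WHERE "):
        body = body[6:]
    if not body:
        return f"WHERE {allowed}"

    # Parenthesise both halves so the requester's OR cannot bypass the restriction.
    return f"WHERE ({body}) AND ({allowed})"


secured = restrict_to_companies("WHERE OrderNum > 100", {5, 6, 8})
```

Wherever this ends up living (business entity or data access object), the parentheses are the important part: without them a requested filter containing OR would widen the result beyond the allowed companies.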
Security is another issue, since you can insert
anything.
I think it's fair to say that implementing security
in the business entities was not one of the original
goals of AutoEdge.
I was talking about the danger of injecting something unwanted in the filter: the server seems to trust the filterstring, so it's easy to tamper with. Compare it with the SQL-injection thread.
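One common way to reduce that risk is to stop accepting a free-form filter string at all and have the requester send structured conditions that the server validates before building the query. The sketch below is a Python illustration under assumptions, not anything AutoEdge does: `ALLOWED_FIELDS`, `ALLOWED_OPS`, `build_where`, and the quote-doubling rule for string literals are all hypothetical, and the real escaping rule depends on the target query language.

```python
# Hypothetical whitelist: logical field name -> required Python type of the value.
ALLOWED_FIELDS = {"OrderNum": int, "CustName": str}
ALLOWED_OPS = {"=", "<>", "<", "<=", ">", ">="}


def build_where(conditions):
    """Build a where clause from (field, operator, value) triples, rejecting anything unexpected."""
    parts = []
    for field, op, value in conditions:
        expected = ALLOWED_FIELDS.get(field)
        if expected is None:
            raise ValueError(f"unknown field: {field!r}")
        if op not in ALLOWED_OPS:
            raise ValueError(f"operator not allowed: {op!r}")
        if type(value) is not expected:
            raise ValueError(f"wrong value type for {field}: {value!r}")
        if expected is str:
            # Assumed quoting rule: double any embedded single quote.
            literal = "'" + value.replace("'", "''") + "'"
        else:
            literal = str(value)
        parts.append(f"{field} {op} {literal}")
    return "WHERE " + " AND ".join(parts) if parts else ""


safe = build_where([("OrderNum", ">", 100), ("CustName", "=", "O'Brien")])
```

A tampered request such as `("OrderNum", "=", "5 OR TRUE")` is rejected because the value is not an integer, rather than being pasted into the query.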
I question whether any single business entity will
have two different kinds of data sources in a real
world application ie. partly db-table, partly XML,
partly WebServer, etc. I could imagine that in such
a case, a different design of the business entity
might be in order (ie. break it up into more
appropriate entities)
I also question the possibility in real life of swapping something in the back end without disrupting the rest of the application. There are very few applications capable of unplugging a predefined customer entity concept, for example, and replacing it with something that comes from an external system.
To me, this seems to be a sort of pre-OO viewpoint. Once one starts to think in terms of objects which don't necessarily correspond to the stored form, it is very natural to think in terms of a strong isolation between the database form and the form anywhere else. But, prior to that, one tends easily to think in terms of temp-tables which are simply local extensions of the database table. This gets one some separation, but not real isolation since one is assuming exactly the same data types and structures in the BL layer as one has in the database.
It seems to me, unless I have been reading in my own interpretations, that the answer per the OERA and all established practice about such layered application design is that nothing above the data access layer should know anything about the stored form of the data and, in fact, should be totally unaware of the source of that data.
I hope that not only do we get this, but we also get a session on AutoEdge itself. One could easily fill several sessions.
To be sure, but I would encourage discussion during the development as well as you make design decisions.
And, of course, even if we pick at you endlessly about some aspects, there is no question that it provides a lot of very interesting and useful examples in a context where they make a lot more sense than isolated code snippets in a manual.
That AE is being picked at is a very good sign - how much worse would it be if it was so bad and useless that it was completely ignored?
I'll take passionate but constructive criticism over deadly silence any day.
I think I would take this one
step further in suggesting a Finder layer between these existing
data access layers. A Finder should be created by the object that
needs it, but the Finder should attach itself to the component
responsible for providing the object of interest. I say "attach"
because it should request a connection to an existing object or the
request should cause the instantiation of that object without the
Finder being aware which has happened. The same should happen
between this "data access" component and any source components,
i.e., it should request attachment from a service component and it
should be ignorant about whether it is receiving a previously
instantiated version or one that was instantiated for the purpose.
Note that one of the advantages of this is that one can have
multiple versions of a data source object and which type is
instantiated can be a configuration option. As a final note, I
would also like to observe that there is a certain "twist" in
design which derives from the fact that a database connection is a
property of the session. In other OO languages, one has a
connection object and a particular data source connects to a
particular instance. Thus, one can have multiple data base
connections in a single session and can connect any one particular
data source to the appropriate connection. I'd like the ability to
have a data base connection object in this way in the OOABL. In
particular, I would like to have both ABL and SQL versions so that
I could use one or the other as appropriate. Wouldn't it be nice to
include a SQL connection instead of an ABL connection when the
function of the component was reporting and one knew that a full
table scan was needed?
I think this is a question that has three parts:
1) If "we" had it to do all over again, what would we do differently?
2) Based on #1, what would be the design of a fresh implementation based on .p's?
3) What would be the design of a fresh OO implementation?
I think #1 is a question that also has two very different aspects, at the least:
1) What architectural choices were made that on reflection should be done differently?
2) What "scope" decisions were made that on reflection should be broadened?
In the latter category, two examples which I recall having come up so far are:
1) The car "inventory" is actually housed in a table for cars available for test drives and while there is a flag to indicate what is and isn't available, this is not the way one would design an application for growth and expansion. Instead, one should have a separate inventory with a breakdown by site, and the inventory should have style/color/option characteristics to recognize the relationships between related vehicles. This makes the cars eligible for test drives a simple table of pointers into the main inventory. All this takes is a few more tables and only very slightly more complex access logic, but it is a structure on which one could build.
2) Updates from the web update the HQ data which updates the dealer data, but updates to the dealer data don't update HQ. This leads to inconsistent data and seems to me to be a fairly basic piece of functionality which should not have been excluded.
Given that there might be multiple types of business
entity accessing the same data source, it seems essential that the
security be enforced at the level of the data source. This is
doubly true since things like limiting the output by company
probably involves use of appropriate indexes to make efficient
queries. Anyone who has worked with the Varnet schema in which many
tables had an "entity code" as the first field in most indexes and
has seen the full table scans which result from not specifying an
entity code because there was only one company at some particular
site is going to be painfully aware of this!
In particular, I would like to have both ABL and SQL versions so that I could use one or the other as appropriate. Wouldn't it be nice to include a SQL connection instead of an ABL connection when the function of the component was reporting and one knew that a full table scan was needed?
You wouldn't be trying to get around the Language group's failure to include NO-INDEX in the ABL?
A danger which is dramatically heightened by having components
above the level of the data source aware of the actual schema.
There may be
very few such systems today, but I see no reason why we can't have
them going forward. If you have access to the beta forum, take a
look at my posting of a code sample for method override in which
there is a customer object which can be initialized in one of three
ways: 1) Empty, such as one would use for creating a new entity; 2)
With all fields filled, such as one would use for instantiating
based on a database record; or 3) With all fields, but in the form
of XML, such as one would use for instantiating from a foreign
service. Really not much additional code required to do this.
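The beta-forum sample itself is not reproduced here, but the three ways of initializing a customer can be sketched in a few lines. This is a Python illustration under assumptions, not the posted OOABL code: Python has no constructor overloading, so alternate constructors (`from_record`, `from_xml`) stand in for the overloaded methods, and the field names and XML layout are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Customer:
    # 1) Empty: the defaults give a blank object for creating a new entity.
    cust_id: int = 0
    name: str = ""

    @classmethod
    def from_record(cls, record):
        """2) All fields filled, e.g. from a database record (here a plain dict)."""
        return cls(cust_id=record["CustNum"], name=record["Name"])

    @classmethod
    def from_xml(cls, xml_text):
        """3) All fields, but delivered as XML, e.g. from a foreign service."""
        root = ET.fromstring(xml_text)
        return cls(cust_id=int(root.findtext("CustID")), name=root.findtext("Name"))


new_customer = Customer()
from_db = Customer.from_record({"CustNum": 3, "Name": "Acme"})
from_service = Customer.from_xml(
    "<Customer><CustID>3</CustID><Name>Acme</Name></Customer>"
)
```

Once constructed, the object looks the same to the rest of the application whichever route it came in by.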
This reminds me of my dissertation adviser,
W.W.Howells, who died a little over a year ago. One of the things
his students had trouble getting used to is that if you gave him a
paper he really liked, it would come back absolutely covered in red
ink. Once one actually paid attention to what was written, though,
one realized that all this ink was a mixture of "good point", "this
reminds me of XYZ, which you might want to look into", "this would
be clearer if you said ...", "see also the study by ...", and the
like. I.e., what he was doing was commenting on the piece in the
way that he would if he were trying to help prepare it for
publication ... making corrections where needed or useful, but a
lot of it either complimentary or ideas that either might be
incorporated to make the paper even stronger or even just ideas
that one might want to pursue later. By contrast, if he didn't like
the paper it would come back with a single note at the top such as
"you can do better" or "this is not adequate". I confess to similar
practice ...
I don't know that I agree that any one table should
only be referenced in one place though. To me, the
data access layer serves to do the object-relational
mapping. So, if item description needs to be a part
of order line and also a part of item, that doesn't
mean that I need to get that description from an item
object, I can get it straight from the database, at
least since I am only reading it. Similarly, the
descriptions which go with the codes in an order or
order line. The ShipVia code, for example, might be
referenced in a whole bunch of places with all of its
properties read only except where ShipVia codes are
actually being maintained.
So, update and validation operations go through the BE, but read operations do not have to? I have some concern about this. Why duplicate the read code when it's all already done in another BE? This may seem overkill for a simple find, but when the read operation gets more complex, I would prefer to re-use what's already coded in the BE that manages the requested data.
As to the layering in the data access layer, my own
inclination is to use a Finder and a Mapper in
Fowler's terms. The Mapper is the only one with
reference to the database. The Finder needs to
establish the query, but it doesn't really need to do
so in terms of the database tables. It can do so in
terms of the object references. Thus, for a simple
example, the database might have a field called
CustNum which was a unique identifier while the
object might call this CustID because we have decided
that all unique identifiers should be XxxxID. The
Finder would have a method for setting up the Mapper
to do a unique find of an individual record, but in
Finder this would be called CustID. Only in the
FindUnique or SetUniqueID method of the Mapper would
this get equated with CustNum.
Is there some Progress sample code out there based on this "Finder and Mapper" approach ?
I have some I am
working on, but I haven't published it yet.
if you have access to the beta forum, take a look at my
posting of a code sample for method override in which
there is a customer object which can be initialized
in one of three ways:
1) Empty, such as one would use for creating a new
entity;
2) With all fields filled, such as one would use for
instantiating based on a database record; or
3) With all fields, but in the form of XML, such as
one would use for instantiating from a foreign
service. Really not much additional code required to
do this.
I was not questioning whether or not it would be possible to change the physical datasource at runtime, but I was questioning what you want to achieve with it. A lot of articles out there are using database abstraction to do a (theoretical) replacement of the physical datastore. For instance, replacing real datasources with mockup objects when unit testing business logic. There is a lot of hassle in setting up such an environment.
Now the other reason for doing this is the functional replacement of a predefined version of an entity (internal datasource) with a foreign version (external datasource). But I really question if an application is able to use the foreign version that easily. The application relies on property X of an entity and that property might not be available in the foreign entity. Sure, for specific cases you could design a solution, but that would require some more thought than just being able to "swap datasource at runtime". A simple example is the foreign key in database systems. What will happen when the primary key of a foreign customer entity doesn't match the key defined in your system? This might require some additional mapping table.
And, of course, even if we pick at you endlessly
about some aspects, there is no question that it
provides a lot of very interesting and useful
examples in a context where they make a lot more
sense than isolated code snippets in a manual.
That AE is being picked at is a very good sign - how
much worse would it be if it was so bad and useless
that it was completely ignored?
I'll take passionate but constructive criticism over
deadly silence any day.
Me too especially as this was one of our main goals of the whole project, to get people thinking and talking about this stuff !
From my ramblings above, the possible conclusion I am
coming to is that it allows the data source to be
something other than a simple read of the database.
The data access object responsible for assembling an
Order doesn't need to know whether the local copy of
the data source is a cached table from another
service or cached table from a local database or a
component that accesses the data row by row from a
local data base.
Agreed. More to follow on this in my next post...
I think the question of how you architect such a caching mechanism is a very interesting one.
In the scenario where we have a number of AppServer agents servicing clients with business entities and these business entities are based on "foreign" data (ie. we need to cache it somehow), should each Agent be keeping its own cache or should we be sharing this cached information through an OpenEdge database that each Agent is connected to?
With 20 Agents, if each Agent keeps its own copy of the cached information, we would potentially have 20 caches with all of the records in each cache. Even with a lazy cache population approach (only get the remote records when I need them and don't already have them), each record will be requested 20 times in the end. Of course, once all the caches are fully populated (which is rather unrealistic), performance will be great. Start another Agent though and then performance issues may appear while this Agent's cache gets populated.
And then there's the question of how each of the Agents will get information of updates to the cached information should it change. If each Agent were to receive a message (perhaps subscribe persistently to a Sonic topic) for such changes, the same message would be sent to each Agent. What happens when Agent number 21 starts though? Does it build its own cache from the remote data source from scratch?
If I were to design such a caching mechanism, I'd probably go with one single copy of the cached data (stored in an OpenEdge database) that each of the Agents connect to. The advantages being:
- cache updates only sent once to a pool of Agents
- the number of Agents connecting to that cache is irrelevant
- far more scalable solution (single persistent copy of the cache, buffered reads, etc)
- predictable performance when scaling at runtime (performance not based on which Agent services the client)
- the cache is persisted, ie. no rebuild when restarting the AppServer or rebooting the server
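The "each record will be requested 20 times" arithmetic is easy to check with a toy model. The following Python sketch is only a simulation under assumptions: `RemoteSource`, `LazyCache`, and `remote_fetches` are hypothetical names, an in-memory dictionary stands in for both the per-Agent cache and the shared OpenEdge database, and nothing here models messaging or persistence.

```python
class RemoteSource:
    """Stand-in for the foreign application; counts how often it is asked."""

    def __init__(self):
        self.fetch_count = 0

    def fetch(self, key):
        self.fetch_count += 1
        return f"record-{key}"


class LazyCache:
    """Fetches a record from the remote source only on first use."""

    def __init__(self, remote):
        self._remote = remote
        self._data = {}

    def get(self, key):
        if key not in self._data:
            self._data[key] = self._remote.fetch(key)
        return self._data[key]


def remote_fetches(agent_count, shared):
    """Have every Agent read the same record once; return the remote fetch count."""
    remote = RemoteSource()
    if shared:
        one_cache = LazyCache(remote)
        caches = [one_cache] * agent_count  # every Agent uses the same cache
    else:
        caches = [LazyCache(remote) for _ in range(agent_count)]
    for cache in caches:
        cache.get("customer-1")
    return remote.fetch_count


per_agent = remote_fetches(20, shared=False)
single_copy = remote_fetches(20, shared=True)
```

With private caches the remote source is hit once per Agent for the same record; with one shared copy it is hit once in total, regardless of how many Agents are started later.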
...and if you're still with me...
Now we would have a cached copy of the remote data. A cached copy in a local database accessible to each of the AppServer agents in question. Now we're back to having local copies of the data again. The Cache Manager is responsible for making sure that the cache is up to date. The business entity just uses the local data.
...and of course now we have something that has a striking similarity to replicating data from one application to the other...
I guess in such a scenario, any changes to the data would have to be sent back to the application that owns the data, which of course avoids any of the tricky collision processing required if you do two-way replication.
So now that I've talked myself almost all the way around the circle, I think I could design a solution where for any given business entity, the data source would always be local - perhaps partly through a caching mechanism - for all read events. For write events, I'd be wanting to push the appropriate pieces back to the owning application rather than committing them locally. (Now, should I also commit it locally, so I don't have to wait for the change to be processed by the owning application, which then sends the updated data to my cache manager? Hmm... guess I didn't completely remove that tricky collision management stuff. I definitely have to think about this some more.)
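A rough sketch of the "read locally, write to the owner" idea, in Python for brevity. All class names are hypothetical, and the owner notifies the cache manager synchronously here, which sidesteps exactly the waiting and collision question raised in the parenthesis above.

```python
class OwningApplication:
    """The application that owns the data and publishes every change."""

    def __init__(self):
        self.data = {}
        self.subscribers = []

    def commit(self, key, value):
        self.data[key] = value
        for notify in self.subscribers:
            notify(key, value)


class CacheManager:
    """Keeps the local copy up to date from the owner's change messages."""

    def __init__(self, owner):
        self.local = {}
        owner.subscribers.append(self._on_change)

    def _on_change(self, key, value):
        self.local[key] = value


class BusinessEntity:
    """Reads only the local copy; pushes writes back to the owner."""

    def __init__(self, cache, owner):
        self._cache = cache
        self._owner = owner

    def read(self, key):
        return self._cache.local.get(key)

    def write(self, key, value):
        # Not committed locally: the local copy changes only when the
        # owner publishes the update back to the cache manager.
        self._owner.commit(key, value)


owner = OwningApplication()
cache = CacheManager(owner)
entity = BusinessEntity(cache, owner)

before = entity.read("cust-1")
entity.write("cust-1", {"name": "Acme"})
after = entity.read("cust-1")
```

In a real deployment the publish step would be asynchronous, so a read immediately after a write could still see the old value.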
So what's my new stand on sce*.p programs? In the scenario described, I still don't think we need them for populating the business entity. The place where I think they could be useful is when the cached components are sourced from multiple different applications.
Then again, if you're using something like an ESB, you could publish the change once and configure an ESB process which ensures that all of the appropriate applications are notified of the new data and each of the applications could extract the data that they own, committing it to their persistent storage.
From my ramblings above, the possible conclusion I am
coming to is that it allows the data source to be
something other than a simple read of the database.
The data access object responsible for assembling an
Order doesn't need to know whether the local copy of
the data source is a cached table from another
service or cached table from a local database or a
component that accesses the data row by row from a
local data base.
Sure
I question whether any single business entity will
have two different kinds of data sources in a real
world application ie. partly db-table, partly XML,
partly WebServer, etc. I could imagine that in such a
case, a different design of the business entity might
be in order (ie. break it up into more appropriate
entities)
I've posted a question on a thread to see how many people in reality actually use a non-OE, and hopefully more specifically a non-DB, data source. It will be interesting to see how many, and then how they handle it.
...snip
I think the business entity should not know
anything about a data source.
I think I would take this one step further in
suggesting a Finder layer between these existing data
access layers. A Finder should be created by the
object that needs it, but the Finder should attach
itself to the component responsible for providing the
object of interest. I say "attach" because it should
request a connection to an existing object or the
request should cause the instantiation of that object
without the Finder being aware which has happened.
The same should happen between this "data access"
component and any source components, i.e., it should
request attachment from a service component and it
should be ignorant about whether it is receiving a
previously instantiated version or one that was
instantiated for the purpose. Note that one of the
advantages of this is that one can have multiple
versions of a data source object and which type is
instantiated can be a configuration option.
One of the possibilities here is through discovery, and when the service request arrives at the service provider, the first step is to perform discovery to turn a service request into a physical set of components and methods that need to be used in order to service it.
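To make the "attach without knowing whether it already exists" idea concrete, here is a small Python sketch. `ServiceProvider`, `CustomerFinder`, and the two source classes are hypothetical names, and the dictionary passed to the provider stands in for whatever configuration or discovery step would resolve a logical name to a physical component.

```python
class ServiceProvider:
    """Maps a logical service name to a component, creating it on first use."""

    def __init__(self, config):
        # config: logical name -> class to instantiate (a configuration choice)
        self._config = dict(config)
        self._running = {}

    def attach(self, name):
        """Return the running component for `name`, starting it if needed."""
        if name not in self._running:
            self._running[name] = self._config[name]()
        return self._running[name]


class LocalCustomerSource:
    def describe(self):
        return "customer rows read from the local database"


class CachedCustomerSource:
    def describe(self):
        return "customer rows read from a cache fed by another service"


class CustomerFinder:
    """Created by whoever needs it; attaches itself via the provider."""

    def __init__(self, provider):
        # The finder cannot tell whether the source was just instantiated
        # or was already running.
        self._source = provider.attach("customer-source")

    def where_from(self):
        return self._source.describe()


provider = ServiceProvider({"customer-source": CachedCustomerSource})
first = CustomerFinder(provider)
second = CustomerFinder(provider)
```

Swapping `CachedCustomerSource` for `LocalCustomerSource` in the configuration changes the physical source without touching `CustomerFinder`.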
As a final note, I would also like to observe that
there is a certain "twist" in design which derives
from the fact that a database connection is a
property of the session. In other OO languages, one
has a connection object and a particular data source
connects to a particular instance. Thus, one can
have multiple data base connections in a single
session and can connect any one particular data
source to the appropriate connection. I'd like the
ability to have a data base connection object in this
way in the OOABL. In particular, I would like to
have both ABL and SQL versions so that I could use
one or the other as appropriate. Wouldn't it be nice
to include a SQL connection instead of an ABL
connection when the function of the component was
reporting and one knew that a full table scan was
needed?
Well, here again I think potentially steps in the DSO, maybe the DSO should handle the connection and it shouldn’t be a session-wide thing. So then again, the responsibility for the physical data source, how to connect and query it, is at the lowest level, and the DAO doesn't care.
Well, here again I think potentially steps in the
DSO, maybe the DSO should handle the connection and
it shouldn’t be a session wide thing So
then again, the responsibility for the physical data
source, how to connect and query it is at the lowest
level, and the DAO doesn't care.
Of course, this is indeed possible today if each of the DSOs is accessed through an AppServer. Forgetting the performance overhead of such a design for a moment, this would mean that each of the DSOs is completely independent with respect to record locking and transactions. No problem for read operations, but how would you handle the writes and transaction scoping?
Okay, now I want to remove the performance overhead. I look forward to reading the specs on a single ABL client having multiple, independent connections to the same database at runtime, which can somehow be consolidated for transaction scoping purposes.
I question whether any single business entity will
have two different kinds of data sources in a real
world application ie. partly db-table, partly XML,
partly WebServer, etc. I could imagine that in such a
case, a different design of the business entity might
be in order (ie. break it up into more appropriate
entities)
Whereas I think this is not only likely, but nearly
certain. E.g., whose responsibility is it to know
about a Customer ... usually this is centralized in
Accounts Receivable since that is the place that
knows about credit limit, payment history, late
payments, credit hold, etc. So, an Order service is
going to have to reference the AR service in order to
get Customer information and then is going to have to
update the AR service to reflect value on order and
ultimately value invoiced.
I can see where you're coming from. You can read my earlier post regarding caching mechanisms, but for an interactive system, I'm not convinced that without some kind of caching mechanism, real-time (synchronous) requests for data are going to meet the requirements for uptime, etc. I don't want my application to be negatively affected (i.e. not available) just because the other application isn't running or the network is down, etc. Only if all of these things have very high guaranteed uptime and the performance is there would I consider architecting a solution in such a fashion.
Getting back to the question of a single business entity with two different kinds of data sources (i.e. partly db-table, partly XML, partly WebServer, etc.), I don't think your example actually has two different kinds of data sources. I can imagine that your Order BE needs information about the customer, but would it ask for that information from the Customer BE? Now we're back to each BE having its own (and single kind of) data source.
Is this how you would build it?
Pahh, who needs transactions
Funnily enough, this is a 2nd Exchange proposal of mine, "Transactions in a distributed SOA world". So keep the discussion going, I might need to grab some ideas!!
Okay, now I want to remove the performance overhead, I look forward to reading the specs on a single ABL client having multiple, independent connections to the same database at runtime, which can somehow be consolidated for transaction scoping purposes
Something like DEFINE BUFFER buffer-name FOR source-buffer.
?
In the scenario where we have a number of AppServer
agents servicing clients with business entities and
these business entities are based on "foreign" data
(ie. we need to cache it somehow), should each Agent
be keeping its own cache or should we be sharing
this cached information through an OpenEdge database
that each Agent is connected to?
Sorry, but I have missed the part that you're going from data access components to another agent running on an AppServer. A data access tier wouldn't be calling a layer above and it would certainly not be calling an AppServer agent in my vision.
A little comment on the way DSOs are implemented in AutoEdge. DSOs are in a separate .p and are instantiated and destroyed on each request. Isn't this going to affect performance? I think that, once started, a DSO should stay in memory, like the BE and DAO. Is there a reason why it has been done like that?
What are your thoughts on a business entity which requires different logical data structures? By that I mean different relations, a different order in which tables are read, optional tables, optional fields. Do we really have to make multiple versions of the physical implementation of a Business Entity like it has been done in AutoEdge? My inclination is that there should always be only one physical implementation of a BE, even if its logical structure is highly dynamic.
I was not questioning whether or not it would be possible to
change the physical datasource at runtime, but I was questioning
what you want to achieve with it. Obviously, there is a need for a
continuing agreement about a minimum necessary contract. Still, one
of the particularly nice things about SOA is the ability to evolve
the source by adding new elements into the XML and still have the
consumer able to use this message without modification. But, the
core here is that I am not really talking about shifting from a
datasource in application A to a datasource in application B,
although that is certainly possible with some work, but rather
shifting from a local copy of A to a remote copy of A because the
application becomes distributed. With the flexibility of deploying
an SOA across one to many boxes plus alternative sources such as
are provided by ESB data engines and DataXtend technology,
flexibility in data source would seem to be a fairly key area in
flexible deployment.
Well, as I seemed to convince myself earlier in the discussion, I
still think they may have some value for insulating the component
which assembles objects from the actual data source. E.g., we might
start off with a data source which is a simple direct access of a
shared database because everything is on one machine. Then, we
might segregate the AR and Order services to two machines and
create a cache for the Order service which might be a dedicated
database or it might be a database which handles all caching for
all Order services. For some other type of data, we might decide
that a local cache is meaningless because it is too volatile or too
rarely accessed and simply send off for it on an as needed basis.
Having a separate layer to make those choices can allow the layer
which does the assembly into objects remain fixed despite changes
in implementation or deployment. One interesting question in the
context of OO is whether any object assembly happens in the data
source level or whether that is deferred to the data access
component. I am currently thinking the latter, but I feel free to
change my mind.
Are using it now and might want to be
using it in the future are not necessarily the same answer, of
course. More importantly, I think there is a great potential for
development in which all of the data is persisted in OE databases,
but there are multiple of these on multiple machines so that the
proximate source of a particular piece of data may be an XML
message over Sonic. That is proximately a non-OE data source,
although the data ultimately lives in OE.
Well, here again I think potentially steps in the DSO, maybe the
DSO should handle the connection and it shouldn’t be a
session wide thing So then again, the responsibility for the
physical data source, how to connect and query it is at the lowest
level, and the DAO doesn't care.
I would like to have this
option, particularly since it would allow for the possibility of
using a SQL connection as an alternate when the fetch was
appropriate to the virtues of SQL.
I'm guessing that is a big piece
of forgetting. We have two phase commit, don't
we?
No, it
would get data from the customer data source, not the customer
domain object. There is no need to build a customer object here. At
least, that is what I am thinking at 11:36 PST ...
A very juicy
topic, indeed. Perhaps you should sketch out some ideas and start a
thread. Then we can all rip it apart^H^H^H^H^H^H^H^H^H^Hhelp you
polish it up.
I was thinking more in terms of DEFINE CONNECTION.
I am
with you in that I am still fuzzy at this point about how each
piece connects. AppServer seems to me to have been designed for
handling the client-server issue, but a lot of what we are talking
about here is server-server. I haven't tried it, but my impression
from remarks on PEG is that agent to agent connections using
AppServer are disappointing in performance, something that one
doesn't notice when there is a network in the middle. In the
absence of a significantly multi-threaded client, I suppose that we
need one of two things. One would be a highly performant kind of
bus to which multiple sessions could connect and on which they
could communicate, but one which was ABL specific and thus had
minimal overhead and fewer fancy features than Sonic. The other
would be a highly performant way of making session to session
connections with directory control so that one would locate a
component and then connect to it. This might be possible with
sockets, but that seems like a lot of overhead and doesn't really
solve the problem because of the limitations of the single threaded
session. Likewise pipes.
Something like DEFINE BUFFER buffer-name FOR
source-buffer.
I was thinking more in terms of DEFINE CONNECTION.
I know that's what you were thinking about.
My question is - what's the difference?
Are you sure of that? I'll have to go back to the
code to check, but I thought these were located via a service
provider.
Can you give an example to
make this more concrete and less theoretical? To me, this sounds
like a perfect case-in-point for a superclass and/or one or more
interfaces and a set of subclasses. One instantiates the
appropriate subclass for the current need, but interacts with it
using methods defined by the superclass or the interface(s) so that
all subclasses effectively have the same signature, at least in
most places. The superclass includes all the common code and the
subclasses include the "deviant" parts.
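A minimal Python sketch of that superclass/subclass arrangement. The `OrderEntity` hierarchy and the sample order are hypothetical and only illustrate the shape of the idea: callers use one signature, and each subclass supplies the part that differs.

```python
from abc import ABC, abstractmethod


class OrderEntity(ABC):
    """Common code lives here; callers only use this interface."""

    def __init__(self, orders):
        self._orders = orders

    def fetch(self):
        # Shared logic: walk the orders, let the subclass shape each one.
        return [self._shape(order) for order in self._orders]

    @abstractmethod
    def _shape(self, order):
        """The 'deviant' part supplied by each subclass."""


class OrderHeadersOnly(OrderEntity):
    def _shape(self, order):
        return {"order_id": order["order_id"]}


class OrdersWithLines(OrderEntity):
    def _shape(self, order):
        return {"order_id": order["order_id"], "lines": list(order["lines"])}


orders = [{"order_id": 1, "customer": "Acme", "lines": ["bolt", "nut"]}]
headers = OrderHeadersOnly(orders).fetch()
with_lines = OrdersWithLines(orders).fetch()
```

Both subclasses are driven through the same `fetch()` call, so the caller does not need to know which one it was handed.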
DEFINE BUFFER is
simply a separate buffer within the same connection. DEFINE
CONNECTION would be entirely separate, i.e., a different database,
a different access method such as SQL, a different set of
parameters like RO, etc.
>DSO are in a separate .p and are instantiated and
>destroyed on each request.
Are you sure of that? I'll have to go back to the
code to check, but I thought these were located via a
service provider.
Looking at the fetchWhere in daCar.p, it begins with some calls to addDataSource() and ends with a call to removeDataSources().
These two procedures are defined in daSupport.p.
addDataSource() starts the DSO without checking if it's already running.
removeDataSources() unconditionally deletes all DSOs from memory.
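To put a number on the difference, here is a Python sketch that contrasts the start-and-delete behaviour described above with a variant that checks whether a DSO is already running. This is not the AutoEdge code: the class names and the "car" / "car-option" source names are made up, and only the add/remove pattern is taken from the description.

```python
class DataSourceObject:
    """Stand-in for a DSO procedure; counts how many times one is started."""

    started = 0

    def __init__(self, name):
        self.name = name
        DataSourceObject.started += 1


class PerRequestSupport:
    """Start on every add, drop everything on remove (as described above)."""

    def __init__(self):
        self._sources = []

    def add_data_source(self, name):
        self._sources.append(DataSourceObject(name))

    def remove_data_sources(self):
        self._sources.clear()


class PooledSupport:
    """Alternative: start a DSO once, then only attach and detach it."""

    def __init__(self):
        self._pool = {}
        self._attached = []

    def add_data_source(self, name):
        if name not in self._pool:  # the missing "already running?" check
            self._pool[name] = DataSourceObject(name)
        self._attached.append(self._pool[name])

    def remove_data_sources(self):
        self._attached.clear()  # detach only; instances stay in the pool


def starts_for(support, requests):
    """Run `requests` fetches and return how many DSOs were started."""
    before = DataSourceObject.started
    for _ in range(requests):
        support.add_data_source("car")
        support.add_data_source("car-option")
        support.remove_data_sources()
    return DataSourceObject.started - before


per_request_starts = starts_for(PerRequestSupport(), requests=10)
pooled_starts = starts_for(PooledSupport(), requests=10)
```

Ten requests against two sources cost twenty starts in the first variant and two in the second. Whether that matters in practice depends on how expensive starting a persistent procedure really is.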
But, the core here is that I am not really talking
about shifting from a datasource in application A to
a datasource in application B, although that is
certainly possible with some work, but rather
shifting from a local copy of A to a remote copy of A
because the application becomes distributed. With
the flexibility of deploying an SOA across one to
many boxes plus alternative sources such as are
provided by ESB data engines and DataXtend
technology, flexibility in data source would seem to
be a fairly key area in flexible deployment.
I find it hard to imagine that you want to access a data source component on another machine. This means that you're going to expose that data source as a public and remote service. This leads to a data source that has no encapsulating business logic (since that logic will be on the other box).
Sounds wrong. Also tends to make its definition as a singleton in the new overview document a bit of a joke.
Is it daCar.p that is handled by the service?
I'll look later, but I am still digging out from the morning mail today!
I would like to have this option, particularly since
it would allow for the possibility of using a SQL
connection as an alternate when the fetch was
appropriate to the virtues of SQL.
This route will be very scary, since your "4gl buffers" might become dirty without the ABL-runtime knowing it when you read the same information via sql as well. And you might deadlock yourself when the 4gl has a pending transaction.
You could experiment with it by using an ODBC DataServer connecting to an OpenEdge database and use the "RUN STORED-PROCEDURE" command in the ABL... This can give you a result set in ABL.
I am thinking of the SQL strictly for reporting type connections where one is reading no-lock and the equivalent of -RO.
The access to a data
source on another box is access via a Sonic message to another
service, not to the individual component itself. How else would you
expect the AR service to own Customer and for the Order service to
get a copy of the Customer info?
I think the question of how you architect such a
caching mechanism is a very interesting one.
In my simple view a consumer should be responsible for caching things and not the producer. The producer might cache things, but it will cache them for other reasons. A consumer might be on the other end of the wire, so he knows it would make sense to stuff a result in a cache. The producer should recognize dirty copies, so in its protocol it should make sure it can detect them.
A very practical sample: when you go to the shop to buy some beer, you probably buy a six-pack instead of a single can. It makes sense, since it will save you 6 round trips compared to buying single items. The cashier didn't persuade you to buy a six-pack, no, you did it yourself... But when you only have money (resources) for one item, you have just one choice: buy the single item. Even when there is a bargain when buying the six-pack.
The access to a data source on another box is access
via a Sonic message to another service, not to the
individual component itself.
So all your data sources are Sonic services? I don't know Sonic from a practical point of view, but I know it's a message broker. How on earth are you going to manage and coordinate transactions and avoid locking conflicts? Yes, there is two-phase commit, but this is another ballgame. Doing things in one process is hard already, but distributing the data access tier as well... that's a challenge!
I can see a need to create a stock allocation service, a credit check service, etc, but I see them orchestrated by some workflow kind of task. This is way up in the architecture, far away from the data source level...
Caching needs to happen in multiple places. E.g. it is sensible for
a BL session to cache a table of state codes so that it doesn't
have to keep going back to the database for something so simple and
static, especially if the actual source of that data is across one
or more wires. But, by the same token, there is every reason to
have a service level cache for that data to avoid lots of trips on
the bus. At the authoritative source, a cache might make no sense
for this kind of data because it is relatively quick and easy to
instantiate a new collection. But, it is not always easy for a
cache to know when something is dirty. It only knows this if it is
the authoritative owner of the data. In fact, if the data source is
to be state-free, replicable, and separate from the physical cache
itself, as we have been discussing above, then any one source isn't
going to be directly aware that one of the other sources has
updated the data. Of course, with this kind of structure, the next
time it goes to get a copy of that data, it will get the updated
copy, but it doesn't automatically know that the copy that is there
now is different from the copy that it provided some other
component 10 minutes ago. I think this implies the need for version
stamps in caches so that when the other component returns the
record for update, the data source can notice that the version of
the before image is different from the version of the stored image
in the cache and respond appropriately. We still need the
distributed pub/sub though.
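A small Python sketch of the version stamp idea. `VersionedStore` and `StaleVersionError` are hypothetical names; the point is only that an update carries the version it was read at, and the store rejects it if the stored version has moved on in the meantime.

```python
class StaleVersionError(Exception):
    """Raised when an update is based on an out-of-date copy."""


class VersionedStore:
    """Holds rows with a version stamp that is bumped on every update."""

    def __init__(self):
        self._rows = {}  # key -> (version, data)

    def put_new(self, key, data):
        self._rows[key] = (1, dict(data))

    def read(self, key):
        version, data = self._rows[key]
        return version, dict(data)  # hand out a copy, not the stored row

    def update(self, key, read_version, data):
        current_version, _ = self._rows[key]
        if read_version != current_version:
            # The caller's before-image is older than what is stored.
            raise StaleVersionError(
                f"{key}: read at v{read_version}, store is at v{current_version}"
            )
        self._rows[key] = (current_version + 1, dict(data))
        return current_version + 1


store = VersionedStore()
store.put_new("cust-1", {"credit_limit": 1000})

# Two components take a copy of the same record.
version_a, copy_a = store.read("cust-1")
version_b, copy_b = store.read("cust-1")

copy_a["credit_limit"] = 1500
new_version = store.update("cust-1", version_a, copy_a)

copy_b["credit_limit"] = 500
try:
    store.update("cust-1", version_b, copy_b)
    conflict_detected = False
except StaleVersionError:
    conflict_detected = True
```

The second update is refused because it was based on version 1 while the store is already at version 2. What "respond appropriately" means after that (retry, merge, report to the user) is left open here, as it is in the post.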
Well, I am still puzzling this out, but no. A local
data source is just that, local. A remote data source is accessed
by communication with a different server and that is what I see
happening via Sonic, although I would love to also have a
lighterweight ABL to ABL protocol for communication within the same
machine as another option. Which said, ESB does include handling of
distributed and long-lasting transactions. This doesn't necessarily
mean holding a lock on any one record for a long time. I can have a
process in service A which has an update that involves service C
and service D as well as itself. The message goes out to those
other two services and they make quick updates and return a
completion message and then the local service makes its quick
updates. Backing out an incomplete update in this context can
certainly be more complicated than a simple multi-table DB
transaction, but hey, such is the nature of distributed systems!
I think this implies the need for version stamps in
caches so that when the other component returns the
record for update, the data source can notice that
the version of the before image is different from the
version of the stored image in the cache and respond
appropriately. We still need the distributed pub/sub
though.
Notifying distributed listeners is an enormous overhead and kills the scalability of the system. Think about a shop and its customers who buy a newspaper. Is it really the shop's responsibility to track and notify all customers who bought a paper and tell them the next day the paper is no longer up to date?
Which said, ESB does include handling of distributed
and long-lasting transactions.
Yes and it also implies that you have designed your database tables in such a way they support long running transactions as well. Think about storing incomplete orders that need some additional servicing and throwing them away when the "long running transaction" aborts it. "Long running" most of the time applies to business to business communication, where you order an item, wait for some delivery confirmation and approve the order. There could be days in between...
Sonic is not meant for fine-grained and low-level operations (that's what Sonic people evangelize as well).
We don't have any disagreement here.
The order may or may not have to be persisted, depending on the state. If it is just being filled in and it is OK to start again if the system crashes, then it doesn't get stored until it is completely filled in. If one is sending off a shipping request on it, then clearly it needs to be stored.
I wouldn't expect most transactions to be actually long running, but let's take the case of an order which has been sent for shipment and it is now complete and can be invoiced. This might involve updates to the warehouse service, the order service, and the AR service as well as possibly some others like an EDI service. Suppose one of those services is down? Presto, an unintended long term transaction.
Not necessarily,
particularly if every session is linked to Sonic where one can
broadcast a message.
I wouldn't expect most transactions to be actually long running, but let's take the case of an order which has been sent for shipment and it is now complete and can be invoiced. This might involve updates to the warehouse service, the order service, and the AR service as well as possibly some others like an EDI service. Suppose one of those services is down? Presto, an unintended long term transaction.
I can't see that - I'd think the transaction should be "undone", and a status set somewhere that something couldn't get done, and can't get done until service "X" is back.
Which basically means the tx is rolled back, and the order transitions to a new "pending completion" kind of state.
But holding the transaction open because of a downed service? No way.
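A Python sketch of the "undo and park the order" approach described in this post. The service names, the `Service` class, and the status strings are assumptions for illustration. Nothing is held open while a service is down: completed steps are undone and the order records what it is waiting on.

```python
class ServiceDown(Exception):
    """Raised by a service that cannot be reached."""


class Service:
    """Hypothetical service with an apply step and a matching undo step."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.applied = set()

    def apply(self, order_id):
        if not self.up:
            raise ServiceDown(self.name)
        self.applied.add(order_id)

    def undo(self, order_id):
        self.applied.discard(order_id)


def invoice_order(order, services):
    """Apply the invoicing step everywhere, or undo it and park the order."""
    done = []
    try:
        for service in services:
            service.apply(order["id"])
            done.append(service)
    except ServiceDown as down:
        for service in reversed(done):  # undo in reverse order
            service.undo(order["id"])
        order["status"] = "pending completion"
        order["waiting_on"] = str(down)
        return False
    order["status"] = "invoiced"
    order["waiting_on"] = None
    return True


warehouse = Service("warehouse")
orders = Service("orders")
ar = Service("ar", up=False)
order = {"id": 42, "status": "shipped", "waiting_on": None}

first_try = invoice_order(order, [warehouse, orders, ar])
status_after_first_try = order["status"]
waiting_on = order["waiting_on"]

ar.up = True  # the AR service comes back
second_try = invoice_order(order, [warehouse, orders, ar])
```

The retry here is triggered by hand. Something still has to notice that service "X" is back and pick up the pending orders.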
I think you are looking at this from the perspective of
traditional ABL DB transactions where the ideal is to get in and
out real quick, minimizing locks. But, not all transactions are
like that. I know of examples from the Forté world
where the pre-Forté reality was what amounted to a
transaction that would take 2-3 days to complete. There was no
mechanism to manage this, so when it fell apart and couldn't be
completed, the whole thing had to be undone by hand. With the
Forté equivalent of the ESB this was reduced to a
matter of hours, but was still far from instantaneous because it
involved dozens of computers, not all of which could be expected to
be on line at any given time. Handling that takes some fancy
footwork, but it addresses the business reality and makes the
system far more resilient. Note, this doesn't mean that the
workflow doesn't involve warnings. E.g., the Forté
example I am thinking of included logic to automatically reboot
unresponsive machines and/or to page people to do it manually when
that didn't work.
We don't have any disagreement here.
...
I wouldn't expect most transactions to be actually
long running, but, lets take the case of an order
which has been sent for shipment and it is now
complete and can be invoiced. This might involve
updates to the warehouse service, the order service,
and the AR service as well as possibly some others
like an EDI service.
Sure, those are all very real life services, but I don't see how this will be addressed at the data source level. When orchestrating services you will have to decompose the workload into smaller units, else it will be a monolithic application.
Let's assume the user submits an order to the system. The order service will process the request. Since processing an order is a complex request, it will be decomposed into smaller services and the initializing service is responsible for dividing the request into tasks and delegating them to the proper sub-services. It should also manage the transaction scope. A service makes use of entities, datasources and what have you.
I don't see a data source component calling another external (AppServer) service, nor putting something into a Sonic queue. It will be the top-level service or sub-services (workflow) that potentially will communicate with Sonic. A data source will communicate with a database or a file system, something that's local to it. When something has to call out to a (potentially) remote service, a proxy class should be involved. A proxy class encapsulates the remote object and behaves like a local object. Its primary goal is to manage the lifetime of the remote object and to impersonate the remote service.
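A minimal Python sketch of such a proxy class. `RemoteCreditCheck` is a pretend remote service with a made-up credit rule, and `CreditCheckProxy` is hypothetical too. In the architecture discussed here the remote side would sit behind an AppServer or Sonic call rather than a local object.

```python
class RemoteCreditCheck:
    """Pretend remote service; connect/disconnect stand in for real plumbing."""

    def __init__(self):
        self.connected = False
        self.calls = 0

    def connect(self):
        self.connected = True

    def disconnect(self):
        self.connected = False

    def check(self, customer_id, amount):
        if not self.connected:
            raise RuntimeError("not connected")
        self.calls += 1
        return amount <= 1000  # made-up rule, for illustration only


class CreditCheckProxy:
    """Looks like a local object; owns the remote object's lifetime."""

    def __init__(self, remote_factory):
        self._factory = remote_factory
        self._remote = None

    def approve(self, customer_id, amount):
        if self._remote is None:  # connect lazily, on first use
            self._remote = self._factory()
            self._remote.connect()
        return self._remote.check(customer_id, amount)

    def close(self):
        if self._remote is not None:
            self._remote.disconnect()
            self._remote = None


proxy = CreditCheckProxy(RemoteCreditCheck)
small_ok = proxy.approve("cust-1", 250)
large_ok = proxy.approve("cust-1", 5000)
proxy.close()
```

The caller only ever sees `approve()` and `close()`. Connecting, reusing, and releasing the remote object stay inside the proxy.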
So we have an AppServer exposing medium grained services, we have the AppServer internal architecture with tasks (workflow), transactions, entities and datasources (when you feel it's necessary). We might have sonic for B2B-communication and/or workflow enablement. It will decouple AppServer services from each other and will take care of transforming external messages to well defined internal messages which are connected to internal services. And we have clients calling our AppServer services.
When you want to hide the physical AppServer partitioning from the clients, you could use Sonic as the service you communicate with. It will be the central configuration point. In that case the client won't call an AppServer directly, but it would post a request to the queue. When the client has something useful to do, it could even do an asynchronous Sonic request and poll/be notified for the result.
As noted previously above and elsewhere, I think an
ESB has a *lot* of potential used within an integrated suite of
applications, even if they are all part of the same ERP and all
come from the same vendor. Not the least of these is the ability to
externalize workflow into the ESB workflow engine, thus allowing
the customer to customize workflow without reprogramming. It may
also be the only way we are going to get distributed pub/sub
events.
What are your thoughts on business entity which
requires different logical data structure ? By that I
mean different relations, different order in which
tables are read, optional tables, optional fields. Do
we really have to make multiple versions of the
physical implementation of a Business Entity like it
has been done in AutoEdge ? My inclination is that
there should always be only one physical
implementation of a BE, even if its logical structure
is highly dynamic.
I certainly think this is one of the areas we'd look at, and maybe do differently if we were to do it again. I think we've come to the conclusion that where possible we'd strive for one BE managing different 'views' or logical data structures. It may not always be possible, feasible or desirable, but I think it's probably a good thing to aim for.
>snip..
No, it would get data from the customer data source,
not the customer domain object. There is no need to
build a customer object here. At least, that is what
I am thinking at 11:36 PST ...
Which is why I think you'd use a DSO, and allow the Order DAO to use a customer DSO as part of its process.
But holding the transaction open because of a
downed service? No way.
I think you are looking at this from the perspective
of traditional ABL DB transactions where the ideal is
to get in and out real quick, minimizing locks. But,
not all transactions are like that.
>.....snip
One of the things we're looking at is the concept of managed and un-managed transactions, where managed are those transactions where you can rely on the OE platform to provide built-in support (DO TRANSACTION... etc.) due to the data source you're using, as opposed to an un-managed transaction where you can no longer rely on the built-in transaction handling of the platform, and therefore have to create your own transaction manager, compensating transactions, etc. This would imply that as soon as your process uses an un-managed data source (i.e. a data source where you cannot rely on the built-in txn handling of the OE platform), your whole process is an un-managed transaction and you will have to create an appropriate transaction model to handle all this (assuming you're performing updates, of course).
If anyone has any experience of this in their application then I'd be interested to know as I'm looking to do a transaction whitepaper which will be used for my exchange presentation
What are your thoughts on business entity which
requires different logical data structure ?
Can you give an example to make this more concrete
and less theoretical?
Basically this means the BE should be able to return different structures of a ProDataSet, depending on what the client requests.
Here are some simple examples using an Order BE (which manages the Order and OrderLine db tables):
Client requests:
- a set of Orders, without its OrderLines;
- a set of Orders, with some Customer info (name, address, phone, email, ...);
- a set of Orders, with Customer name;
- a set of Orders, with its OrderLines and some Item info;
- a set of Orders sorted in a specific way;
- a single Order with only some specific fields;
- all OrderLines of a specific Order;
- every Order which has at least one OrderLine satisfying a specific condition. This condition can be validated using an indexed field in the OrderLine table. Consequently, the query fetching this data should start with "FOR EACH OrderLine ".
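The requests listed above could be sketched roughly as follows. This is Python rather than ABL, with plain lists standing in for the db tables and for the ProDataSet. `FetchRequest`, `fetch_orders` and all field names are hypothetical. The point is only that one entry point can shape its result from a request description instead of needing one BE per structure.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Stand-ins for the Order and OrderLine tables (invented sample data).
ORDERS = [
    {"order_num": 1, "cust_num": 10, "status": "Shipped"},
    {"order_num": 2, "cust_num": 11, "status": "Open"},
]
ORDER_LINES = [
    {"order_num": 1, "line_num": 1, "item_num": 100, "qty": 2},
    {"order_num": 2, "line_num": 1, "item_num": 101, "qty": 5},
    {"order_num": 2, "line_num": 2, "item_num": 100, "qty": 1},
]

@dataclass
class FetchRequest:
    include_lines: bool = False
    order_fields: Tuple[str, ...] = ()           # empty means "all fields"
    sort_by: str = "order_num"
    descending: bool = False
    line_filter: Optional[Callable[[dict], bool]] = None

def fetch_orders(request):
    orders = ORDERS
    if request.line_filter is not None:
        # Drive the query from the lines, as in the last bullet above.
        matching = {l["order_num"] for l in ORDER_LINES if request.line_filter(l)}
        orders = [o for o in orders if o["order_num"] in matching]
    orders = sorted(orders, key=lambda o: o[request.sort_by],
                    reverse=request.descending)
    result = []
    for order in orders:
        row = {k: v for k, v in order.items()
               if not request.order_fields or k in request.order_fields}
        if request.include_lines:
            row["lines"] = [l for l in ORDER_LINES
                            if l["order_num"] == order["order_num"]]
        result.append(row)
    return result

headers_only = fetch_orders(FetchRequest(order_fields=("order_num", "status")))
big_orders = fetch_orders(
    FetchRequest(include_lines=True, line_filter=lambda l: l["qty"] >= 5))
```

Joining in Customer or Item info would be further options on the same request object. Whether that stays manageable for a large schema is exactly the open question here.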
Also, in a server-to-server interaction, I think a request to a BE should be able to return only a single value when there is no need for a ProDataSet. An example of this could be: BusinessEntity1 (BE1) has to do some validations. One of its validations requires knowing whether a specific condition is true or false for a table which is managed by BusinessEntity2 (BE2). BE1 makes the request to BE2, which returns only the condition value, no ProDataSet involved.
NB: I know it is written that a BE shouldn't access another BE, but this is something I'm still struggling with. More questions will follow on this subject.
Why would your unmanaged transactions not come under the coverage of the Sonic distributed transactions ... other than, of course, the requirement for PSC to package and price Sonic so that we can affordably use it in such a way.
One of the things we're looking at is the concept of
managed and un-managed transactions, where managed
are those transactions where you can rely on the OE
platform to provide built in support, DO
TRANSACTION... etc, due to the data source your
using, as opposed to an un-managed transaction where
you can no longer rely on the in built transaction
handling of the platform, and therefore have to
create your own transaction manager, compensating
transactions etc.
This link provides an overview of some of the terminology: http://java.sun.com/blueprints/guidelines/designing_enterprise_applications_2e/transactions/transactions8.html. I don't think the ABL runtime is suited to creating a compensating resource manager or distributed transactions. Besides that, we have had some bad experiences with COM+ and "declarative transactions". Due to its nature, a declarative transaction tends to lock too much data. In COM+ it forces you to split your components into read-only and transactional components, which is not very intuitive.
Why do you think
that? OO applications are full of interactions between domain
objects. What else are they supposed to do?
Also, in a server to server interaction, I think a
request to a BE should be able to return only a
single value when there is no need for a ProDataSet.
An example of this could be: BusinessEntity1(BE1) has
to do some validations. One of its validation
requires to know if a specific condition is true or
false for a table which is managed by
BusinessEntity2(BE2). BE1 make the request to BE2,
which returns only the condition value, no prodataset
involved.
Why does this interaction involve the data access
layer at all? I.e., why is this not just a method
call on the other BE? Seems perfectly reasonable and
normal and the meat of what one expects to happen in
the BL to me.
Sorry, I'm thinking out loud, and after reading my post, this quote may seem out of context and missing some detail. I was talking more about the business entity layer than the data access layer. Using a ProDataSet as the parameter for BE methods facilitates the use of a generic service interface (proSIgateway.p) to make requests to a BE. If, however, the BE has some public methods with different parameter signatures, how can calls to such methods be integrated into a service interface?
Should a server-to-server call use the service interface or use the BE directly?
NB: I know it is written that a BE shouldn't
access another BE
Why do you think that? OO applications are full of
interactions between domain objects. What else are
they supposed to do?
OERA Quick Ref. Guide states that a BE cannot access another BE.
Also, Mike said:
... in terms of how the components interact with each other, such as should a BE call another BE directly, we've put what we consider to be sensible guidelines down in the quick ref guide mentioned above. So if you need to for example use more than one BE to service a request, then this should be made a task, and it has the job of then doing the orchestration between the BE's. This again should allow for greater flexibility and re-use, as you are not building any dependencies at the BE level, and it removes the possibility of an accidental call architecture within the Business Components layer!
I guess I need to go grub around in
the innards of AutoEdge a bit more because this just doesn't make
any sense to me. Mind you, I am thinking and speaking in OO terms,
which might allow for some differences, but nevertheless ... To me,
a domain object has a whole bunch of methods, potentially none of
which use ProDataSets. If PDS have a role, it is as part of the
mechanism for collections, but when one is working on an order,
one doesn't have a collection (unless collection is used for the
lines, but at least one doesn't have an Order collection). No way I
want to create a collection for one entity. And, no way I want the
overhead of PDS for every parameter. I'll need to poke around a bit
more because this makes no sense to me at all. As for the Quick
Reference Guide, I will also have to take another look, but it
seems wrong to me. To be sure, there is a virtue in the structure
of having a "task" or "process" object manage the interaction, but
I'm not at all prepared to suggest that is universal. If the
entities are interacting based on their method signature, how is
that any flaw of encapsulation or any dependency that doesn't exist
in the task object relying on the signature?
Which is why I think you'd use a DSO, and allow the Order DAO to use a customer DSO as part of its process.
Ahh... that's starting to make some more sense then.
So to take this one step further and linking back to the discussion about security, if you have to allow/disallow users access to records in a particular table based on values in that table (e.g. if the user is a salesrep, they are only allowed to see their own customers' details), this filtering would have to be coded in the customer DSO, not the Order DAO, right?
Security as a whole probably gets applied
in many places, but certainly this kind of security is one that one
would probably want to support with an index, so it would be very
appropriate to apply it in the DSO.
if the user is a salesrep, they are only allowed to
see their own customer's details), this data would
have to coded in the customer DSO, not the Order DAO,
right?
Seems tricky to me, since now this DSO will always filter the rows. Suppose the server handles a different type of request that needs to update that order "behind the scenes" due to some business rule. When you ask your DSO, it won't give the row to the layer above... So now you will either need to move that update to another class or you will have to disable that filtering one way or another.
More accurately, it will always filter the rows **for a
request related to a specific salesrep**. This doesn't mean that
one can't design the same component so that it filters for a group
of reps, i.e., all those relating to a given manager, or does no
filtering at all for some mass update or reporting function. These
are all just variations on the where clause.
This doesn't mean that one can't design the same component
so that it filters for a group of reps, i.e., all
those relating to a given manager, or does no
filtering at all for some mass update or reporting
function. These are all just variations on the where
clause.
As long as you're able to make the right choice. The class one level up might not know which version to call: the filtering one or the non-filtering one...
No reason for it not to be one class. Like I said, just themes and variations on a where clause. I would never create different classes for this.
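As a rough illustration of "variations on the where clause" inside a single component, here is a Python sketch. `CustomerDataSource`, its `fetch` method and the sample rows are all made up. The same class serves a single rep, a manager's group of reps, or an unfiltered mass update, and the caller still has to decide which variation it wants, which is the concern raised above.

```python
# Stand-in for the Customer table (invented sample data).
CUSTOMERS = [
    {"cust_num": 1, "name": "Alpha", "sales_rep": "BBB"},
    {"cust_num": 2, "name": "Beta", "sales_rep": "DKP"},
    {"cust_num": 3, "name": "Gamma", "sales_rep": "BBB"},
]

class CustomerDataSource:
    """One class; the filter is just a parameter of the request."""

    def fetch(self, sales_reps=None):
        # sales_reps=None means "no filtering" (mass update or reporting).
        if sales_reps is None:
            return list(CUSTOMERS)
        allowed = set(sales_reps)
        return [c for c in CUSTOMERS if c["sales_rep"] in allowed]

source = CustomerDataSource()
own_customers = source.fetch(sales_reps=["BBB"])          # a single rep
team_customers = source.fetch(sales_reps=["BBB", "DKP"])  # a manager's reps
all_customers = source.fetch()                            # unfiltered
```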
...
This link provides an overview of some of the
terminology
http://java.sun.com/blueprints/guidelines/designing_en
terprise_applications_2e/transactions/transactions8.ht
ml. I don't think the ABL-runtime is suited for
creating a compensating resource manager nor
distributed transaction. Besides that we have some
bad experiences with COM+ and "declarative
transactions". Due to the nature of the declarative
transaction, it tends to lock too much data. In COM+
it makes you have to split your components in
readonly and transactional components, which is not
very intuitive.
Thanks Theo I'll take a look.
Which is why I think you'd use a DSO, and allow
the Order DAO to use a customer DSO as part of it's
process.
Ahh... that's starting to make some more sense then.
Glad I make some sense
So to take this one step further and linking back to
the discussion about security, if you have to
allow/disallow users access to records in a
particular table based on values on that table (eg.
if the user is a salesrep, they are only allowed to
see their own customer's details), this data would
have to coded in the customer DSO, not the Order DAO,
right?
Yes, I think so. As Thomas states, security will be applied at many levels, but I think your assumption here is a good one.
snip....
OERA Quick Ref. Guide states that a BE cannot access
another BE.
Also, Mike said:
... in terms of how the components interact with
each other, such as should a BE call another BE
directly, we've put what we consider to be sensible
guidelines down in the quick ref guide mentioned
above. So if you need to for example use more than
one BE to service a request, then this should be made
a task, and it has the job of then doing the
orchestration between the BE's. This again should
allow for greater flexibility and re-use, as you are not
building any dependencies at the BE level, and it
removes the possibility of an accidental call
architecture within the Business Components layer!
We went through quite a few discussions on this, and using Business Tasks in this way seems to give the best flexibility in terms of re-use (of BEs) whilst providing a clean transaction model and not creating, as I said before, an accidental call architecture, where BE1 calls BE2, BE2 calls BE3, BE3 calls BE1, etc. Just as the ESB tries to orchestrate app-to-app communication and remove the accidental architecture, I see the Business Task performing the same function internally to remove the accidental call architecture! But don't forget, at the end of the day the OERA is a set of guidelines and recommendations; whether you choose to follow them, deviate slightly or totally disregard them is obviously an individual choice. Having said that, I do feel the OERA offers more advantages than disadvantages; it's just a case of determining what is most practical for your situation when you come to implement.
But keep it coming, this is good stuff
Mike, I don't think anyone questions the notion that using tasks for local orchestration is a fine and wonderful idea. What I do question is the absolute exclusion of one BE calling another BE. I wonder if this is because I am thinking in terms of objects, despite AutoEdge still being .ps. For example, it is a fairly common thing for one object to spawn another, but with this rule, any communication between them would have to go through the host task ... seems screwy to me.
One request I would like to make though is to be careful about statements like:
To me, OERA is a concept or set of concepts. PSC is one of the promulgators and evangelists of this concept, but it is hardly unique to PSC nor new. The specific reference term is PSC's, but this is not at all like ESB, where Sonic can lay some legitimate claim to having originated what is now an industry standard term. OERA is just the local handle for a set of concepts which are widespread and predate the term. Consequently, if my view of the rules of OERA and your view differ in some particular way, I would be unhappy about the idea of this being seen as a "deviation" ... especially when I think mine is more right!
I.e., I think we have two types of variation going on. One type is the pragmatic realization that not everyone will do things in the "right" way to fully implement OERA. It is in PSC's interest to be polite to these people, but as a consultant it is my duty to point out to them that the greatest benefit is going to come from going all the way. The most extreme case being the people who are still using V6 procedural model code ... you still want them on 10.1B, but I want them to start evolving.
The other type are the variations in interpretation of what OERA actually is and how it should be done. We all agree on the concept, but may differ in details or focus. There, I don't think PSC should be placing itself in the position of being the arbiter of the one true faith.
I'm sure that isn't what you meant, but just a request to keep it in mind.
I also question why a BE shouldn't call another BE. I don't think it's an absolute rule and that this should always be transformed into a Business Task.
As I understand it, a Business Task is a multi-step operation, grouping multiple requests to different Business Entities and managing context between the requests. The client is unaware of the multiple steps involved; to the client it looks like a single operation. There are many cases where I see good use for a Business Task.
However, I don't think that requests to other BEs should always be made inside a BT.
A call structure where BE requests are independent:
|-BE1
|-BE2
|-BE3
is different than when a request's return value is mandatory for the calling BE to proceed or when the request must be done inside an operation:
|-BE1
| |-BE2
| | |-BE3
| |-BE2
Here's an example:
OrderLines are saved in the Order BE. For each OrderLine, a complex validation must be performed, which requires data coming from the Inventory BE. Also, when saving, I need to update some other tables using the Inventory BE. The transaction is scoped to each OrderLine and I want the Inventory update to be done inside this transaction.
Just a sidenote to save some space: does anybody know how deep a thread can get or how this thread started?
I also question why a BE shouldn't call another BE.
I have the same question...
Here's an example:
OrderLines are saved in Order BE. For each
OrderLines, a complex validation must be performed,
which require data coming from the Inventory BE.
Also, when saving, I need to update some others
tables using the Inventory BE. Transaction is scoped
to each OrderLines and I want the Inventory update to
be done inside this transaction.
In this scenario it means that you have to move most of the responsibilities of the Order BE to the task class to achieve the "you shouldn't call another BE paradigm". This makes me question what the exact responsibilities of a BE are, since basically it delegates persistence to the data source/access class and it will add some simple validation. The BE looks like an Assembler (pattern), since it will construct complex objects for you.
The real object interaction will take place in the task. So the task will be a Command pattern:
public interface ICommand
{
void Execute();
}
A concrete command will do one thing and a command can be split into sub commands.
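Here is a small Python sketch of a command that is split into sub-commands, applied to the OrderLine/Inventory example. All class names are hypothetical and plain dicts and lists stand in for the database. Transaction scoping is deliberately left out.

```python
from abc import ABC, abstractmethod

class Command(ABC):
    @abstractmethod
    def execute(self):
        ...

class CompositeCommand(Command):
    """A command made of sub-commands, executed in order."""

    def __init__(self, *commands):
        self._commands = list(commands)

    def execute(self):
        for command in self._commands:
            command.execute()

class ValidateOrderLine(Command):
    def __init__(self, line, stock):
        self.line, self.stock = line, stock

    def execute(self):
        # Validation needs inventory data, but no BE calls another BE here.
        if self.stock.get(self.line["item"], 0) < self.line["qty"]:
            raise ValueError("insufficient stock")

class ReserveInventory(Command):
    def __init__(self, line, stock):
        self.line, self.stock = line, stock

    def execute(self):
        self.stock[self.line["item"]] -= self.line["qty"]

class SaveOrderLine(Command):
    def __init__(self, line, saved_lines):
        self.line, self.saved_lines = line, saved_lines

    def execute(self):
        self.saved_lines.append(self.line)

stock = {"100": 5}
saved_lines = []
line = {"item": "100", "qty": 2}

task = CompositeCommand(
    ValidateOrderLine(line, stock),
    ReserveInventory(line, stock),
    SaveOrderLine(line, saved_lines),
)
task.execute()
```

Note how little is left for an "Order BE" to do in this arrangement, which is the point made above about responsibilities moving into the task.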
I think that, according to the AE formulation, the assembly takes
place in the DA, not the BE, but I think you may have pointed to a
key issue in this discussion. Those of us who are discussing the AE
formulation tend to be doing so from the perspective of OO, but AE
is not OO. It has some OO like trappings, not the least of which is
the EA model that accompanies it, but it doesn't seem to me that it
was really approached from the point of view of creating an
application structured in an OO way as much as it was structured to
provide a .p version of an OERA architecture. I think this is
important because, in an OO perspective, the BE is a domain object
and we know that domain objects are supposed to encapsulate the
data and behavior of the "thing" they represent. I think that, in
an OO perspective, domain objects inherently interact with each
other. This is not to say that there aren't also important classes
which relate to processes, but it is to say that a significant part
of the interaction of domain classes is between themselves. Here,
in order to avoid the potential complexities of A calling B calling
C which might call A, logic is removed from BEs into BTs in a way
that one would not be so likely to do if one were thinking of real
classes.
I'd like to return to a topic Theo brought up earlier, which is the number of singletons in this design. When the topic first came up, I was thinking only in terms of the data access layer components being singletons ... a design which can have some virtues, but when I get back to the code it turns out the BE components are singletons as well. I'm wondering if this is an artifact of thinking in terms of SPs and PPs instead of objects. Certainly, I would never design any ordinary domain object as a singleton ... what if I needed two of them?
Frankly, I am even suspicious of having these be singletons in .p code. Apparently it works in the context of the specifics of the AutoEdge application and a single threaded client since, at any given time, one is almost always only going to be thinking about either one specific instance or a specific list, not multiple instances or multiple lists. But, I certainly wouldn't want to impose that as a general rule.
Note that one of the implications of this approach, in which a singleton BE is the holder for either an individual record or a collection, is that the interface to the BE is a ProDataSet, which might be regarded as heavyweight for a single instance of a simple object, although it does have the nice before and after aspects.
OK, OK, I know that one of the remarks which has been made to cover places where the handling of a particular item is not done consistently is that you wanted to illustrate more than one way of handling the problem, but ...
On line 140 of becar.p, there is a getServerOperation to obtain a procedure name and a handle for ValidateDealer. A couple of observations:
1. In becar.p, you have ValidateDealer, but elsewhere it is validateDealer ... good thing ABL is not case sensitive by default.
2. While this seems like an interesting way to decouple a conceptual operation from the specific code that implements that operation, I wonder what has really been accomplished here. The signature of the method is certainly fixed and the name of the operation must be kept consistent (unless you want to "overload" multiple names into the same method) so I'm wondering whether one has really gained any notable flexibility that would not have been there by simply attaching to a super or existing PP instance at the top of the code and just running the method directly. If one decided to change the name of the method at some point, that kind of refactoring is among the simplest to do. For any other kind of change, the connection is much weaker and more difficult to ferret out.
3. Is this not a case of one BE calling a method in another BE?
a design which can have some
virtues, but when I get back to the code it turns out
the BE components are singletons as well.
And the problem with that is that some of the BE state is defined as procedure-global variables. See the "glIsImport" variable and the "glRunValidation" mechanism: the BE can be in only one global state, so the code won't be reentrant. By "reentrant" I mean being able to call the BE from another component while an earlier call to the same BE is still in progress:
BT
--> BE.DoSomething()
> xxxx
> BE.DoSomethingElse()
The BE-instance is the same instance, but "xxxx" might require a different setting in the BE.
The same applies to data source components, which can have only one dataset as their global context (see daSupport.p with "hDataSet").
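The reentrancy problem can be shown with a tiny Python sketch. This is not the AutoEdge code: `CarBE`, `run_validation` and `import_step` are hypothetical stand-ins for a singleton procedure with a flag like "glRunValidation". A nested call changes the shared flag, and the outer call, still in progress, sees the changed value.

```python
class CarBE:
    """Singleton-style BE: its mode lives in state shared by every caller."""

    run_validation = True  # plays the role of a procedure-global flag

    @classmethod
    def save(cls, car, nested_call=None):
        if nested_call is not None:
            nested_call()  # something in the middle calls back into the BE
        # The outer call now sees whatever the nested call left behind.
        return "validated" if cls.run_validation else "not validated"

def import_step():
    CarBE.run_validation = False  # the nested caller wants import mode
    CarBE.save({"id": 2})

outer_result = CarBE.save({"id": 1}, nested_call=import_step)

class CarBEPerCall:
    """Same logic, but the setting travels with each call instead."""

    def save(self, car, run_validation=True, nested_call=None):
        if nested_call is not None:
            nested_call()
        return "validated" if run_validation else "not validated"

fixed_result = CarBEPerCall().save(
    {"id": 1},
    nested_call=lambda: CarBEPerCall().save({"id": 2}, run_validation=False),
)
```

Here `outer_result` ends up as "not validated" even though the outer caller never asked for that, while the per-call version keeps the two settings apart.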
When browsing the sources I find the separation between "data source" ("da") and "data access" ("sc") very unnatural. Maybe the logic has been split to save memory, since the "sce"-sources are run non-persistent.
A comment in "scecar.p" is a good example that something is wrong:
PROCEDURE eCarAfterRowFill:
...
RUN getContext("TestdriveDetails", OUTPUT cContext).
...
/* Checking the context value if the Car Maintenance is called from the Change Car Dialog */
IF cContext <> "" AND cContext <> ? THEN DO:
...
phDataSet:GET-BUFFER-HANDLE("eCar"):BUFFER-DELETE.
END.
ELSE DO:
...
RUN denormalizeCar(INPUT hBuffer).
END.
So the component that's at the bottom of all the layers is concerned with which screen it was started from. I think you will get these kinds of exceptions when there is an artificial split in logic. I think the "sce"-code should be included in the "da"-code, since it's pretty useless to "share sce"-code like it's done right now.
And instead of a "da"-procedure calling other "da"-procedures or a "be" calling another "be", we have "da"-procedures calling several "sce"-procedures. So:
BE
> DA
> SCE1
> SCE2
> SCE3
The problem with this is that all three layers are bound via the same temp-table/prodataset definition.
I'd like to return to a topic Theo brought up
earlier, which is the number of singletons in this
design.
..snip...
To save this thread getting even longer, this might be a good topic for a new thread. I'll post a considered reply as soon as I can (I'm currently in the midst of moving guys packing my worldly goods as part of a relocation to the US!!)
Mike
I'm not so inclined to criticize this point individually because I
don't think a domain object has any business trying to be
stateless. This implementation is closer to a domain object
simulator ... and I suppose one might want that to be stateless,
but the whole thing seems to me to be the wrong concept.
While I agree that it is inappropriate for a data source to be aware of the calling context, I have become semi-convinced that there is a potential virtue in this two layer system because it can separate the physical access from the assembly, thus allowing flexibility in where the data actually comes from.
So, what more do you need than your laptop and a place to connect to the internet? What's important here after all!
But, in the end, their loss, our gain!
Surely there are others who could help pitch in on some of these points, though ... a bit of filler, at least? Spread the work?
I'm not so inclined to criticize this point individually because I don't think a domain object has any business trying to be stateless. This implementation is closer to a domain object simulator ... and I suppose one might want that to be stateless, but the whole thing seems to me to be the wrong concept.
So how would you handle recursing through something like a BOM explosion, etc?
Glad I make some sense
You did, until I thought some more
As far as I can see, re-using a DSO in multiple DAO's would only make sense if the DAO doesn't allow updates to the "foreign" DS.
In the example discussed, the Order DAO would use a Customer DSO to get some customer-specific information. In this case, the customer-specific information in question should generally not be updatable by the Order DAO, with such updates having to be sent back through a customer BE/DAO. Is that how you also see things?
Of course, the exception here might be any customer specific information that is stored in the customer table in the database, like (intentionally denormalised) credit-check/outstanding balance information. And this information in turn would not be updatable from the Customer DAO.
If the above were true, I could imagine that the implementation of such a security model would become an absolute nightmare.
Thoughts?
In most cases, I agree that these "secondary" DSOs would be read-only. And, indeed, things like updating denormalized amounts would not be done by simply updating a total in the DSO ... who knows what might have happened to that total in the meantime ... but rather by sending a credit authorization to the AR service, which would both OK and update the AR service total.
I don't know that there is a perfect split, though. That will take more examples and thinking.
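For what it's worth, the "send a credit authorization to the AR service" idea might look something like this in outline. It is Python rather than ABL, and `ARService`, `authorize` and the numbers are invented for illustration. The Order side never writes the outstanding total itself. It asks the owning service, which checks and updates the total in one step.

```python
class ARService:
    """Hypothetical owner of the credit limit and outstanding balance."""

    def __init__(self, credit_limits):
        self._limits = dict(credit_limits)
        self._outstanding = {cust: 0 for cust in credit_limits}

    def authorize(self, cust_num, amount):
        # Check against the current total and update it in the same call,
        # so a stale copy held by the caller can never be written back.
        if self._outstanding[cust_num] + amount > self._limits[cust_num]:
            return False
        self._outstanding[cust_num] += amount
        return True

    def outstanding(self, cust_num):
        return self._outstanding[cust_num]

ar = ARService({1: 1000})
first = ar.authorize(1, 600)   # within the limit
second = ar.authorize(1, 600)  # would exceed the limit, so it is refused
```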