OO infrastructure - OpenEdge Development - Forum


Posted by Thomas Mercer-Hursh on 11-Nov-2008 11:04

Some of us have been working on these questions since the 10.1A beta ...

You might be interested in http://www.oehive.org/OERAStrategies for a discussion about the role of OO in data access and in http://www.oehive.org/OERAOSI relative to OO components for OERA.

I don't see DataXtend as really having a role for ABL applications except in special circumstances, although I certainly agree that it creates a bad impression for it to not support OpenEdge databases and there are certainly environments where one has OE and other applications that it would be a fit.

The Appserver question is, I think, more complex and a place where I am less confident of PSC's vision. When it is discussed in roadmap sessions the focus seems to be on passing objects across the AppServer barrier and on single trip capabilities, i.e., the ability to instantiate an object or procedure, run a method on it, return the results, and, I suppose, delete the object all in a single message round trip.

Myself, I have little interest in sending objects over the wire since I am pretty sure that I can serialize their data and send that more efficiently and doing that I am technology agnostic about what is at the other end. It also means that I can instantiate a different object on the client from that on the server, e.g., a read only version or one with very limited update capabilities.
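
A minimal sketch of that idea, written in Python rather than ABL and using hypothetical class names (`ServerCustomer`, `ReadOnlyCustomer`), might look like the following. Only the data crosses the wire, as JSON here, and the client rebuilds it into a different, read-only class:

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ServerCustomer:
    """Server-side object with full update capability (hypothetical)."""
    cust_num: int
    name: str
    credit_limit: float

    def raise_limit(self, amount: float) -> None:
        self.credit_limit += amount

    def to_wire(self) -> str:
        # Only the data is serialized; the object itself never leaves the server.
        return json.dumps(asdict(self))


@dataclass(frozen=True)
class ReadOnlyCustomer:
    """Client-side view built from the same data; its fields cannot be reassigned."""
    cust_num: int
    name: str
    credit_limit: float

    @classmethod
    def from_wire(cls, payload: str) -> "ReadOnlyCustomer":
        return cls(**json.loads(payload))


server_obj = ServerCustomer(42, "Acme", 1000.0)
server_obj.raise_limit(500.0)
client_obj = ReadOnlyCustomer.from_wire(server_obj.to_wire())
```

Because the payload is plain data, the receiving end need not be the same technology at all; the read-only class is just one possible consumer.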

Instead, I am more interested in the idea of a structure for running persistent services, possibly in pools, and providing a namespace structure for interacting with those services and a way for identical services within a pool to share information, e.g., on the state of a long running query, the first portion of which has already been delivered to the client. I can imagine some ways to cobble this now, but I think we need to be looking at the way that application servers are used in other OO languages and seeing if we can't add those capabilities to what we have.

In fact, if someone is interested in working on a position paper on this topic, let me know.

Posted by Admin on 11-Nov-2008 14:40

My primary concerns are indeed not (as you said in your reply) ways of transporting objects from the server to the client.

My concerns are :

- the system of persistence for objects on the server side, the management of transactions when dealing with objects and, on the performance side, the management of caching (and lazy loading?).

- finally, tools to ease the programming of "database" aware objects, which enables code generation from declarative models.

We've just gone through a 3-year .NET project (v3.5), and those requirements needed to be addressed in order to deliver the software. If such tools and infrastructure had not existed, we would not have been able to design the software.

Posted by Thomas Mercer-Hursh on 11-Nov-2008 15:39

Responding in reverse order

- finally, tools to ease the programming of "database" aware objects, which enables code generation from declarative models.

I think this is going to turn out to be fairly straightforward. Once we have a solid base of models from the OERA OSI effort, it seems likely to me that we can generate the data access classes from the UML without much difficulty.

I think there are a couple of different issues here, depending, of course, on your precise referent.

Caching, lazy loading, etc. seem to me to be aspects of the data access layer and are part of the overall solution above. E.g., persistence object (table?) A may generate both a local DAO and a remote DAO for use on systems where the persistence is not local. The remote DAO will implement lazy load, caching, etc. according to the nature of the object. Yes, there is considerable art involved and we could use some language enhancements to make it easier and better, but the basic concepts seem fairly clear and straightforward.

I don't see transaction management as a special issue with OO, except of course for the issues we necessarily face in a distributed architecture, which causes us to think about transactions in a new way, e.g., Mike Ormerod's fine talk on the subject.

Persistence of objects on the server was where I was going with my remarks about other application servers. For years now we have had it drilled into our heads, stateless, stateless, stateless. There is a lot of wisdom there. But, there is also wisdom in having persistent services, especially if one can still replicate services and provide them with a shared state. I.e., the individual service provider is instantiated, so there is no startup cost, and stateless between messages, but has access to a shared cache of state about all the sessions accessing it so that operations which benefit, like the next increment of some query, can be fulfilled by any of the agents.

It is possible we could approach this already by defining many different appservers, each for a particular service, but I think we need some clear thinking about what we want here and it will probably take some enhancements to the concept of AppServer. I am thinking that one piece of this is that if we can't get multithreading any time in the foreseeable future, then perhaps we can get some kind of highly performant intersession signalling and pub/sub.

Posted by Admin on 11-Nov-2008 16:26

I don't think you speak on behalf of PSC, am I right?

Do you have any information on future initiatives from PSC regarding those subjects?

Posted by Admin on 11-Nov-2008 16:27

If someone from PSC (havaard, shelley, salvador, etc.) could give a focus from "the inside" on this topic ...

Posted by Thomas Mercer-Hursh on 11-Nov-2008 16:38

No, I don't speak on PSC's behalf ... but I do talk to them every chance I get!

In the Exchange roadmap sessions, the only things I recall hearing in this space are objects over the wire and single call as I discussed before. I have seen no evidence that the issues we are talking about here are really even under consideration. That's why I'm thinking that maybe we need to put together a business case similar in intent to the one I did on multi-threading (http://www.oehive.org/MultiThread) ... but it probably needs to be better than that one obviously was since it hasn't had any obvious effect.

Posted by Admin on 12-Nov-2008 00:40

Be sure to have Mike Ormerod on the list of those you address this to.

Look for his previous posts (also related to AutoEdge 2.0, OERA, OERI) in this forum to see how he thinks.

Posted by svi on 12-Nov-2008 10:56

"Remote objects" (the ability for clients to work with remote class object instances on the AppServer) is indeed in our OOABL roadmap, with potential considerations for singleton classes as well. As noted in some replies, discussion about object life cycle management in general, and perhaps especially in distributed configurations, is very interesting and may intersect core OOABL functionality with custom development methodologies or frameworks, including OERA. Please keep the discussion going! Mike, feel free to chime in with your thoughts when you can.

In addition we are looking into ProDataSet enhancements, some of them for easier OOABL interop.

Sylvestre, are you using ProDataSets? How would you see the functionality you describe from a PDS perspective?

Salvador

Posted by Thomas Mercer-Hursh on 12-Nov-2008 12:00

From my perspective, the issues here don't really have to do with PDS. An object may or may not contain a PDS as appropriate, but the issue here is the object itself.

While I think the topic needs some exploration, I think that one way of thinking about it is in relation to Sonic. Sonic gives us a way to loosely couple a collection of services at the expense of a certain amount of overhead. For many purposes, it is the ideal architectural foundation, particularly since each service is technologically agnostic about the others. But, there are also contexts in which we would like to more tightly couple services, probably all OpenEdge services, and do so at a higher level of performance than I believe could be achieved with Sonic.

Just thinking off the top of my head ...

In other OO languages this shows up as an application server in which services run persistently and can be located and used via some naming service. Ideally, each service is replicatable for scalability, but there also needs to be a mechanism for state persistence when needed. In a multi-threaded language this might be accomplished by having the main process spin off a thread for a particular session, the main process really just serving as a gatekeeper. The thread for any particular session might live for some time and multiple messages might go back and forth during its life. If one has multiple instances of the main process, this means that the client session needs to bind to the specific instance to reassociate with the corresponding thread, but this isn't the problem it would be in something like an OE AppServer since the main process or agent is not locked out of interacting with other sessions.

If we can't get multi-threading in the foreseeable future, one would like to imitate this structure using AppServer. The threaded part might be accomplished by starting a pool of sessions, each of which could be given work by the gatekeeper session. One might want to start a certain number of such sessions and have the gatekeeper able to start more if needed. Ideally, there would be a mechanism to closely couple these sessions, perhaps by something like an intersession pub/sub across something like a socket. The second part of this would be being able to bind to a given gatekeeper without locking that gatekeeper out from other requests. I.e., multiple sessions need to bind to the same agent. Obviously, they can't do so simultaneously, but the gatekeeper should be busy for very short bursts, so as long as the requests can queue, one should be OK. A timeout would be nice. The third piece would be a naming service for finding services .... what we have might already be enough, but I haven't looked at it closely enough.
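
To make the gatekeeper idea concrete, here is a rough single-process simulation in Python rather than ABL. The names `Gatekeeper` and `WorkerSession` are invented for illustration. The pool starts with a fixed number of worker sessions, grows up to a limit when all are busy, and queues requests beyond that:

```python
from collections import deque
from typing import Deque, List, Optional


class WorkerSession:
    """Stands in for one pre-started session in the pool (hypothetical)."""

    def __init__(self, worker_id: int) -> None:
        self.worker_id = worker_id
        self.current_job: Optional[str] = None  # None means idle


class Gatekeeper:
    """Hands each request to an idle worker, growing the pool or queueing as needed."""

    def __init__(self, initial: int, maximum: int) -> None:
        self.workers: List[WorkerSession] = [WorkerSession(i) for i in range(initial)]
        self.maximum = maximum
        self.waiting: Deque[str] = deque()

    def request(self, job: str) -> Optional[int]:
        """Return the worker id that took the job, or None if the job was queued."""
        worker = next((w for w in self.workers if w.current_job is None), None)
        if worker is None and len(self.workers) < self.maximum:
            worker = WorkerSession(len(self.workers))  # start one more session
            self.workers.append(worker)
        if worker is None:
            self.waiting.append(job)
            return None
        worker.current_job = job
        return worker.worker_id

    def finished(self, worker_id: int) -> None:
        """A worker reports completion and picks up queued work, if any."""
        worker = self.workers[worker_id]
        worker.current_job = self.waiting.popleft() if self.waiting else None


gk = Gatekeeper(initial=1, maximum=2)
a = gk.request("report-A")   # taken by the one pre-started worker
b = gk.request("report-B")   # pool grows to two workers
c = gk.request("report-C")   # pool is full, so the request waits in the queue
gk.finished(0)               # worker 0 finishes and takes the queued request
```

The gatekeeper itself is busy only for the moment it takes to assign or queue a job, which is the "very short bursts" property described above.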

The alternative to the gatekeeper approach would be for multiple agents to be able to connect to the same "thread" pool. I.e., a client session would connect and a process get started on its behalf and the return to the client would be some kind of ID ... or perhaps the client sends in its own ID. Then, any other agent in the group could use that ID on a subsequent call to connect to the right "thread".

Note that the remote "client" session need not be ABL, but it could be and it might not even be remote.

Posted by Admin on 12-Nov-2008 14:45

Well, I don't have too much trust in the near availability of production-ready data access components from Progress anymore. Discussing this in this and other forums very often led to a discussion about production-ready client components for those data access components as well. And we've heard a couple of times that we should not expect that ADM3 (based on OO and PDS) from PSC.

I'd never say that Progress should not invest time here. Every developer I've talked to in the last few years expected exactly that and was rather disappointed to find that they need to fully understand PDS, OERA, and OO and build their own data access layer and client components as well. Most people consider it a big loss in productivity compared to previous versions of Progress.

Where will this lead? Frameworks arising from 3rd parties (mostly involving additional expenses) or maybe even open source. Thomas is trying hard to kick-start such an initiative.

If Progress is just focusing on the core technology, especially on the AppServer, I'd love to see the ability to run a stateless or state-free AppServer but still keep some context in the memory of the appserver. That might be either some sort of shared state-aware mode (where a single appserver might be locked by multiple clients) or, even better, the ability to keep objects alive in some kind of shared memory on the appserver in stateless mode. Knowing only a little about the internals of the appserver, I believe that this is a more realistic short-term goal than turning it into a multi-threaded appserver with a main thread as a gatekeeper and worker threads dependent on a client session.

What should be part of that context? I think database transactions may be avoidable. I could always live with the fact that transactions are scoped to a single appserver call. So instances of OO classes and everything inside those classes (prodatasets, temp-tables, no-lock queries, variables, references to other objects, etc., maybe persistent procedures) should be available on any appserver process that the owning client talks to (one at a time).

How could it be implemented? Maybe by having a property of the SESSION object on the appserver that we could assign our context object to. The appserver will need to take care of persisting the object between a client's appserver calls and making it available before the next appserver call. That property might be of the type Progress.Lang.Object. So everybody could implement his own ContextManager class.
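
As a rough illustration of that proposal, here is a Python simulation rather than ABL. The `context` attribute stands in for the wished-for SESSION property, which does not exist today, and `ContextStore`, `Session` and `MyContext` are invented names. The store restores a client's context before each call and saves it afterwards, so any agent can serve the next call:

```python
from typing import Any, Callable, Dict


class Session:
    """Stands in for the AppServer SESSION handle; `context` is the hypothetical property."""

    def __init__(self) -> None:
        self.context: Any = None


class ContextStore:
    """Stand-in for shared memory that keeps each client's context between calls."""

    def __init__(self) -> None:
        self._by_client: Dict[str, Any] = {}

    def call(self, client_id: str, session: Session,
             procedure: Callable[[Session], Any]) -> Any:
        session.context = self._by_client.get(client_id)  # restore before the call
        try:
            return procedure(session)
        finally:
            self._by_client[client_id] = session.context  # persist after the call
            session.context = None                        # leave the agent clean


class MyContext:
    """A user-defined context class; everybody could write their own."""

    def __init__(self) -> None:
        self.calls = 0


def count_calls(session: Session) -> int:
    if session.context is None:
        session.context = MyContext()
    session.context.calls += 1
    return session.context.calls


store = ContextStore()
agent_1, agent_2 = Session(), Session()
first = store.call("client-A", agent_1, count_calls)
second = store.call("client-A", agent_2, count_calls)  # different agent, same context
other = store.call("client-B", agent_1, count_calls)   # another client starts fresh
```

The sketch keeps live objects in a dictionary, which sidesteps the hard part: in a real multi-process AppServer the context would have to live somewhere every agent process can reach.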

Posted by Admin on 12-Nov-2008 15:44

One issue regarding OERA is that its documentation is currently not object oriented at all.

This will have a dramatic impact on the overall infrastructure, and the effort will be necessary in order to catch the OO bus on time.

I have seen a presentation by John Sadd on this (class-based implementation of OERA), but I'm not sure there has been detailed documentation or updates since (the doc was written in 2007 ... and OOABL has evolved since).

Posted by Thomas Mercer-Hursh on 12-Nov-2008 17:30

After the last 23 years, I don't think I really want PSC to spend energy on developing a framework unless there is a major shift in the way such projects get done ... say, like contracting with me to do it! I would much rather their energy went into enabling technology.

Knowing only a little about the internals of the appserver, I believe that this is a more realistic short-term goal than turning it into a multi-threaded appserver with a main thread as a gatekeeper and worker threads dependent on a client session.

The interesting question is what do we really need to get any particular effect. I get that multi-threading, per se, is hard. Saving context might not require anything other than perhaps the ability to store an object with state in the database. One could store the object's state without the object, but then one is paying the serialization and deserialization penalty.

But, I don't know that the kind of thing that I was describing is necessarily anything like as hard as multi-threading. Suppose, for example, one had an AppServer for a particular service (it might be nice to have a structure for hosting multiple services within a single AppServer, but let's equate an AppServer with a service for the moment). In that AppServer run the gateway processes.

Then suppose we have a second AppServer whose agents are used by the first AppServer for running "threads". Call the first AppServer G for gateway and the second one T for threads. And let's refer to the agents in each by the letter plus a number. I.e., the gateway agents are G1, G2, G3, etc. So far, I don't think we have anything new except a more difficult setup than usual if we have a lot of these.

So, let's start out with the idea that the G agents are stateless. A G agent gets a request, fires off a thread to work on the problem, and puts an entry in a context table linking the T process to the client. It then returns, tells the client the job is underway, and goes back to waiting for work. The client connects again, not necessarily to the same agent; the agent looks up the client in the context table and goes to the thread process to see how things are doing. Still nothing new.
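
The flow just described can be sketched as follows, in Python rather than ABL. A `ThreadPoolExecutor` stands in for the T AppServer and a plain dictionary stands in for the context table. `GatewayAgent` and its methods are invented for the illustration:

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, Tuple

t_pool = ThreadPoolExecutor(max_workers=2)  # AppServer T: the "thread" agents
context_table: Dict[str, Future] = {}       # client ID -> job running on a T agent


class GatewayAgent:
    """A stateless G agent: everything it needs to know lives in the context table."""

    def __init__(self, name: str) -> None:
        self.name = name

    def start(self, client_id: str, work: Callable[..., Any], *args: Any) -> str:
        context_table[client_id] = t_pool.submit(work, *args)
        return "underway"  # the agent is immediately free for other work

    def poll(self, client_id: str) -> Tuple[str, Any]:
        job = context_table[client_id]
        if not job.done():
            return ("running", None)
        del context_table[client_id]
        return ("done", job.result())


def total(numbers):
    return sum(numbers)


g1, g2 = GatewayAgent("G1"), GatewayAgent("G2")
status = g1.start("client-A", total, [1, 2, 3])
context_table["client-A"].result()  # wait here only to make the example deterministic
outcome = g2.poll("client-A")       # a different agent picks up the result
t_pool.shutdown()
```

Note that the second call goes through `g2`, not `g1`: nothing ties the client to the agent that started the job.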

If we could have multiple clients temporarily connect to the same gateway agent, then we wouldn't need the context table, but that sounds like a less good idea than the context table at this point.

So, what's missing? Well, one thing is that we might want the thread process to go off and do some work and then to let the agent know that it is done. That sounds like it might require tying the thread process to a specific agent and we would be back needing the gateway agent to link to multiple clients at the same time so that it can be managing multiple thread processes at the same time. This is where cross session pub/sub would be useful, i.e., nudge my elbow when you are done.
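
The "nudge my elbow" idea is ordinary publish/subscribe. The sketch below is an in-process Python stand-in, since ABL has no cross-session pub/sub of this kind; `SignalBus` and the topic name `job.done` are invented for the example:

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class SignalBus:
    """In-process stand-in for the wished-for inter-session publish/subscribe."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        for handler in list(self._subscribers[topic]):
            handler(message)


bus = SignalBus()
completed: List[dict] = []

# The gateway side asks to be nudged whenever any job finishes.
bus.subscribe("job.done", completed.append)


def thread_process(job_id: str, numbers: List[int]) -> None:
    """Does its work, then signals completion instead of being polled."""
    result = sum(numbers)
    bus.publish("job.done", {"job": job_id, "result": result})


thread_process("J1", [10, 20])
thread_process("J2", [5])
```

With a signal like this, the thread process does not need to be tied to one specific agent; whoever subscribed gets the notification.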

If the thread is processing a big query and returning batches as requested, then the fresh batch should be available almost immediately and this request could be fulfilled by any agent, as long as there is a way to do something equivalent to running a method on the thread process to get the next batch. But, this means that any agent has to be able to run that method after looking up the right thread in the context table. This is a pretty loose form of cooperation.

For closer cooperation one might need something like a socket connection, but, maybe not. Maybe just being able to run a method on a pre-instantiated process would be enough.

Posted by Thomas Mercer-Hursh on 12-Nov-2008 17:34

Frankly, even the pre-OO OERA stuff is not, nor does it pretend to be, a model for production code. I don't think this would change even if John got around to writing some new OO whitepapers. In fact, if you have been following the discussion between John and me on some of the OO implementations, I'm pretty sure that I would disagree if they did get written ... unless I have convinced John to change his mind!

Seriously, though, I think these models are going to have to come from somewhere else. That is the reason for the OERA OSI so that people from real world production environments can say what they need and people can contribute their ideas.

Posted by Admin on 13-Nov-2008 04:20

After the last 23 years, I don't think I really want PSC to spend energy on developing a framework unless there is a major shift in the way such projects get done ... say, like contracting with me to do it!

So you are applying for a job in Bedford? Don't like the weather in California anymore?

database. One could store the object's state without the object, but then one is paying the serialization and deserialization penalty.

I don't think you can store any kind of context in the database. I've had no-lock queries on my list. That would include the result list of a non 100% indexed query. I want such a query to stay open between appserver hits. This is the only way of effectively moving forward without the hit of reopening such a query every time. I can't think of any way to serialize the state of an object (including the state of that query) in a DB.

So, what's missing. Well, one thing is that we might want the thread process to go off and do some work and then to let the agent know that it is done. That

Sounds like you want appservers to talk to other appservers asynchronously. OK. The problem is what state the calling appserver is in (between two client requests) when the called appserver is done. I assume the async request completed event should re-activate the calling appserver, so that the async request completed event can do serious work.

Posted by Thomas Mercer-Hursh on 13-Nov-2008 11:27

So you are applying for a job in Bedford? Don't like the weather in California anymore? ;-)

Probably Nashua, but note the use of the word "contracting" instead of "hiring".

There are a few weeks in the spring and a few weeks in the fall when it is pleasant to visit New England.

This is why I have been talking about a pool of "threads". One can store the information that client XYZ has a persisted query running in thread agent ABC, but one needs the thread itself to have the state of the query preserved. No, I don't think one can persist the query itself, but there are other kinds of objects which one could persist. If one could send it over the wire, one could persist it.

I think we have a couple of different problem types and correspondingly a couple of different required solution structures.

One problem type is characterized by a table scan query where we want the client to be able to request batches. There we need a thread for the query, but there is no need to go back to the original gateway agent. The first call can start the thread, return the first batch, and then update a table. The second and following calls can go to any agent who can look up the thread in the table and get the next batch.
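
A small Python sketch of this batching pattern follows. It is not ABL: a generator stands in for the open query held by a "thread", and a dictionary stands in for the lookup table. The names `open_queries`, `table_scan` and `next_batch` are assumptions made for the example:

```python
from itertools import islice
from typing import Dict, Iterator, List, Sequence

open_queries: Dict[str, Iterator[int]] = {}  # client ID -> live query held by a "thread"


def table_scan(rows: Sequence[int]) -> Iterator[int]:
    """Stands in for an expensive non-indexed query; it remembers its own position."""
    for row in rows:
        if row % 2 == 0:  # the unindexed selection criterion
            yield row


def next_batch(client_id: str, size: int, rows: Sequence[int] = ()) -> List[int]:
    """Callable by any agent: opens the query on first use, then simply continues it."""
    if client_id not in open_queries:
        open_queries[client_id] = table_scan(rows)
    batch = list(islice(open_queries[client_id], size))
    if len(batch) < size:  # query exhausted, so release it
        del open_queries[client_id]
    return batch


data = list(range(1, 21))
batch_1 = next_batch("client-A", 4, data)  # first call starts the query
batch_2 = next_batch("client-A", 4)        # later calls only need the client ID
batch_3 = next_batch("client-A", 4)        # short batch: the query is finished
```

The client never sends a position back. The query object keeps its own place, and the table maps the client to it.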

Another problem type is something inherently long running. We could pass the request to the gateway agent async and the gateway agent could simply run the request, but then that agent is busy for a long time. Better would be for the gateway agent to hand off the task to a thread and go back to being available for work. But, then the thread has to have a way to return its results.

A third problem type might be characterized by "stateless isn't always the right thing". E.g., take a machine with a service for taking orders, but the service for inventory is on another machine. These are linked by an ESB for action messages. But, we don't really want to be sending a Sonic message for every order line to get basic inventory data until we are ready to commit. So, it would be nice to have a local service which would lazy load inventory data and cache it. If a request comes for data it doesn't have yet, it goes to Sonic and gets it, but for information it already has, the request is fulfilled from the cache. One would also want a mechanism to update the cache with information from the response of any commit message. Possibly, this can be accomplished by having a small pool of "inventory data facade" agents with their own data store, but I would like to consider other designs.
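
The inventory facade in this third case might be sketched like this, in Python rather than ABL. `InventoryFacade` and its method names are hypothetical, and a dictionary lookup stands in for the round trip over the ESB:

```python
from typing import Callable, Dict


class InventoryFacade:
    """Local service that lazily loads and caches inventory data (hypothetical)."""

    def __init__(self, fetch_remote: Callable[[str], int]) -> None:
        self._fetch_remote = fetch_remote  # stands in for a request over the ESB
        self._on_hand: Dict[str, int] = {}
        self.remote_calls = 0

    def on_hand(self, item: str) -> int:
        if item not in self._on_hand:      # lazy load on first request only
            self.remote_calls += 1
            self._on_hand[item] = self._fetch_remote(item)
        return self._on_hand[item]

    def apply_commit_response(self, response: Dict[str, int]) -> None:
        """Refresh cached quantities from the figures returned by a commit."""
        self._on_hand.update(response)


remote_stock = {"widget": 12, "gadget": 3}
facade = InventoryFacade(remote_stock.__getitem__)
facade.on_hand("widget")   # goes to the remote service
facade.on_hand("widget")   # served from the cache
facade.on_hand("gadget")   # goes to the remote service
facade.apply_commit_response({"widget": 10})
widget_after_commit = facade.on_hand("widget")
```

Only two remote requests are made for four lookups, and the commit response keeps the cached figure current without an extra message.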

Posted by Admin on 13-Nov-2008 13:15

Well - I might be too influenced by the term "thread" in the .NET world.

For the designs I tried to implement with the OpenEdge AppServer, I did not require threads on the appserver at all. I don't mind that the appserver is multi-process rather than multi-threaded. It's the context management/handling that was always a challenge.

Here a context object that I could design myself - but that could store references to other objects and also long running queries between batches - would suit the requirements I've had in the past.

Posted by Thomas Mercer-Hursh on 13-Nov-2008 13:39

As you know, I would really like a multi-threaded client since there are some things one just can't fake, but I recognize that is a fairly major rewrite in the AVM. But, I think one could do thread-like things, i.e., use processes as "heavy" threads, probably persistent so there is no startup. It is a service orientation and a smaller scale than with SOA.

Posted by Admin on 14-Nov-2008 03:50

The discussion has gone into a great deal of detail!

Guys, good remarks have come out of the discussion, in particular:

- For the purpose of performance, we would need object caching on the AppServer side.

-- This cache should maintain active versions of object instances in memory and may renew its contents periodically (e.g., renewal based on time of last use).

-- Cache management would be no different from that in any 3GL language; persisting a TT is like persisting a collection for a given property, no more.

- Along with caching, lazy loading will be a useful function as well.

-- Loading chunks of data into memory as records come to be required by a given process can be useful for some processing schemes in production.

- The point regarding multi-threading inside the AVM is indeed another pertinent remark, if only for enabling UI activity during long-running processes.

-- The principle of a main thread for the UI and alternate threads for long-running processes seems to be a good one.
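
One common way to renew a cache by time of last use is least-recently-used eviction. The sketch below is Python rather than ABL and purely illustrative; `ObjectCache` and its loader callback are assumed names, and it assumes a capacity of at least one:

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable, List


class ObjectCache:
    """Keeps the most recently used objects in memory and loads the rest lazily."""

    def __init__(self, capacity: int, loader: Callable[[Hashable], Any]) -> None:
        self._capacity = capacity
        self._loader = loader
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Any:
        if key in self._items:
            self._items.move_to_end(key)     # mark as most recently used
            return self._items[key]
        value = self._loader(key)            # lazy load on first use
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
        return value

    def keys(self) -> List[Hashable]:
        return list(self._items)


cache = ObjectCache(capacity=2, loader=lambda key: f"object-{key}")
cache.get("a")
cache.get("b")
cache.get("a")   # "a" becomes the most recently used
cache.get("c")   # the cache is full, so "b" is evicted
```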

The discussion regarding multi-threading the AppServer is relatively technical, and the detail may be beyond the scope of my summary. I will post a new message for that.

Posted by Admin on 14-Nov-2008 04:11

Hi Thomas,

Regarding the discussion of multi-threading work with the Appserver.

Regarding this point, I think sending back the ROWID (for last occ.) to the AppS along with the request for new batches is enough, while the object instance is stored/kept inside object cache.

Regarding this point, the thing is just looking at what has been done by other vendors (3GL ones like Microsoft, e.g.).

For priorities, the first point is a bigger issue for us than the second, which is important but not a serious issue.

Posted by Thomas Mercer-Hursh on 16-Nov-2008 11:42

Regarding this point, I think sending back the ROWID (for last occ.) to the AppS along with the request for new batches is enough, while the object instance is stored/kept inside object cache.

If the object executing the query is saved, then there is no need to send back the ROWID since the object already knows where it left off with the last batch. What is needed is a way to relate the client to the object instance. My inclination would be for this connection to be managed at the AppServer end. I.e., the client sends in a client ID and it is that client ID that is mapped to the query object rather than passing the identity of the query object out to the client. We need the client ID anyway for authentication.

the thing is just looking at what has been done by other vendors

Exactly. Those environments are multithreaded and allow for a substantial number of persistent objects, both stateless and not. There are things that make ABL different from OO3GLs, but this isn't one of them.

Posted by Admin on 16-Nov-2008 14:54

I would remind everyone that instances of objects can be shared by several clients, hence requiring that context (position in collections or TTs) be stored at the connection level.

Posted by Thomas Mercer-Hursh on 16-Nov-2008 15:08

If I had a query object of the full-table-scan-no-index-return-batches variety, how or why would I share an instance of the object (class, yes) with more than one client/connection? For something easily repositioned, yes, but then I don't really need to persist the query from session to session or at least not nearly so much.
