Brainstorming: Package and Library-level protection

Posted by bsgruenba on 24-May-2010 14:53

My last thread on statics has spawned this follow-up. I am really hoping we'll get some good discussion on this one because building the Exchange Web Service stuff has shown me how much I need it, and I have a hunch PSC will be working on this for OE 11 so the input may be very valuable.
So let me start out by making the statement that private, protected, and public are not satisfactory enough qualifiers for class members. Bjarne Stroustrup recognized this in C++ when he created friend classes, which most C++ programmers consider the devil's own spawn, kinda like global variables. But most miss the purpose of friend classes. They are intended to allow specific classes access to other classes. The mechanism got abused (like global variables) and got a bad rap.
The issue remains, though, that you need to be able to have classes and members that are accessible to a known set of code without them being publicly available. Anything that is public is an open API that anyone can use. You make it public, and other developers have a right to expect that they can use it, and even if you document it otherwise, it will still get used.
Taking the OpenClient as an example, PSC never intended for programmers to directly access the Session class. It was expected that programmers would do all the work through the Connection and AppObjects. But the Session class is public and you can get at it in Java through the o4glrt.jar file. This happens in OO all the time. Another classic example is in Eclipse where there are hundreds of classes that exist in packages with the name "internal" that are not intended to be used outside of the .jar file that is deployed, but often are.
Both Java and C# have solutions for this. In Java, if you don't specify a modifier (private/public/protected) for a class or its members, they default to package level. What that means is that only other classes in that package can refer to the class/member. Of course, this does not solve all scenarios, because what happens if there are other classes outside the package that need to refer to it, as is the case with the Session class in o4glrt.jar?
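To make the Java default concrete, here is a minimal sketch. The class names (Session, Connection, AccessInspector) are made up for illustration and are not the real OpenClient classes; in a real project they would sit in a named package rather than the unnamed one used here to keep the sketch to a single file.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for an internal class; not the real OpenClient Session.
class Session {
    // No modifier: package-level access. Only classes in the same package can call this.
    String describe() {
        return "internal session";
    }

    // Explicitly public: part of the open API.
    public String id() {
        return "session-1";
    }
}

// Hypothetical public-facing class that lives in the same package as Session.
class Connection {
    private final Session session = new Session();

    // Compiles because Connection and Session share a package.
    public String status() {
        return "connected via " + session.describe();
    }
}

// Small helper (an assumption for this sketch) that reports whether a
// no-argument method was declared without any access modifier.
class AccessInspector {
    static boolean isPackagePrivate(Class<?> type, String methodName) {
        try {
            Method m = type.getDeclaredMethod(methodName);
            int mods = m.getModifiers();
            return !Modifier.isPublic(mods)
                && !Modifier.isProtected(mods)
                && !Modifier.isPrivate(mods);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }
}
```

A class in any other package that tried to call describe() on a Session would be rejected by the compiler, while id() would remain callable from anywhere.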
In C#, the internal modifier (which is also the default for top-level types, though members default to private) means that anything in the same assembly can reference the class/member. In Java, this would equate to access to all classes in the .jar file. This is the one that Java needs to fix the o4glrt.jar session issue, but C# does not have the ability to secure stuff within a package.
In OOABL, we don't have either of these options, and we really need them. I have found that out while I have been trying to tie down the API design for the EWS stuff.
I believe we need both styles of modifier for the ABL, but the default modifier already means public and that is really the best meaning for it because that's what most people mean when they don't supply a modifier. So I believe we should have two new modifiers added to the language:
  • a package modifier; and
  • a library modifier
The package modifier should work exactly the way the Java default does. Access is permitted from any other object in the package, but nothing outside.
The library modifier should work like the C# default does. If you bundle your classes as a .pl, only classes in that .pl can access the library classes/methods. If you don't bundle your code in a .pl, library is equivalent to public because there is no way to constrain it.
Thoughts?

All Replies

Posted by Admin on 24-May-2010 15:03

Thoughts?

I'm a fan of those (and careful use of friend).

But it would require far more than just the new modifiers in the language. A .PL file would need to become "sealed" or "signed". Today it's like a ZIP (but I guess without compression). Everybody can extract files from a .PL and add new files to a .PL. That would be against the purpose.

And the compiler needs to become aware of that as well.

Right now a .PL is just part of the language for deployment purposes. With these new modifiers it would become part of a development strategy.

Can the debugger run on .PL code?

Posted by Thomas Mercer-Hursh on 24-May-2010 15:24

It seems to me that neither of these is quite what one wants.

While a library/assembly is a fairly natural unit in C# and Java, it doesn't really have any counterpart in ABL.  Even if one were able to fix a .pl, it really doesn't play the same role as those structures do and I think would create deployment issues.

Likewise, while package is a natural unit, it isn't at all a secure unit.  Once one declares a member as having package access, anyone can drop a new component into the package and have access to it.

Friend actually seems the most controlled option although, while it opens the door only to specific classes, it opens the door wide open rather than exposing a single member.

Of course, my first thought is that needing such a thing is really a question of having not worked out the encapsulation properly, i.e., it is a crutch, so I would like to hear about some fairly concrete use cases in order to:

1) Assure ourselves that there is a valid need; and

2) Explore exactly what that need is.

Posted by Tim Kuehn on 25-May-2010 08:12

It certainly sounds plausible to me.

On the other hand, I wonder what possible down-sides there are to implementing these proposed access control attributes?

Posted by Admin on 25-May-2010 08:54

tamhas wrote:

and I think would create deployment issues.

Can you tell us more on that?

I personally believe that a more tight integration of .PL files into the compile process would solve many deployment issues!

Today compilation issues/different time stamps (or whatever is used to detect outdated compilation) between classes, base classes, interface files and their consumers are more critical than CRC issues with database tables are (because on OE10 they are mostly history).

Posted by bsgruenba on 25-May-2010 10:00

There have been a number of responses that I want to summarize, and respond to here.

Procedure Library Files (PLs)

First, Mike raised some questions about PL files, and I agree that PL files are limited by the lack of a few features, to wit, compression, the ability to digitally sign them, and (and I think this is true but I have not tested it) the lack of the ability to run the debugger against PL files. Having said all that, PL files are the beginnings of the most effective way of deploying 4GL code. For one thing, there is the notion of shared PLs on the AppServer which significantly improves performance and deployment. As Mike pointed out in a follow-up, they are the most effective mechanism of deploying 4GL code and I would argue that rather than throwing the baby out with the bath water, we look at a separate initiative that focuses on fixing the weaknesses of PL files. They need things like manifests, signatures, obfuscation, and so on.

So, for the rest of this discussion, let's parking lot all the things that we consider weaknesses in PL files in a separate thread and assume that for the sake of the Package and Library-level protection that these are addressed or will be in upcoming releases. Think about it this way: If PSC did nothing to address PL files at all, would you lose anything by not having Library-level protection? I cannot think of anything that would be worse because of it, so I am willing to accept their limitations if the PL weaknesses are never resolved.

Package-level Protection

Thomas raised a very good point that I had not considered when I wrote the initial post. He pointed out that using package level protection did not guarantee much protection if you were able to simply create a package in a separate library and use that package to allow access to the protected methods and classes. I have not had time to try this out in Java, but I think he is correct that Java allows this today.

Having said that, that is not how I envisioned this working with OpenEdge and my reasons were far more pragmatic than theoretical. There are two use cases that factor in:

  1. User has code in procedure libraries. Bear in mind that procedure libraries need to be inserted into the propath, and if you allowed package-level protection to extend beyond the scope of a procedure library, there are all kinds of path issues associated with resolving types. So my thought was that package-level protection implied that it be inside a library, too. The fact is that if you build a class with the intent of protecting it within a package, you are really expecting that you *know* what all the classes are that will access it, and anything that is outside the package should not be allowed to do so.
  2. User has code in sub-directories. In this case, the pathing issue is just as much of an issue, so I would suggest that when the compiler tries to resolve the protection, it should verify that the absolute directories of the source code files match. If not, it should not compile.
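As a rough illustration of the directory check in case 2, here is a small Java sketch. It is only an assumption about how such a rule might be expressed, not a description of anything the ABL compiler does; the class name PackageScopeCheck and the file paths used with it are invented.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: two source files count as being in the same package
// only if their absolute, normalized parent directories are identical.
class PackageScopeCheck {
    static boolean sameDirectory(String sourceA, String sourceB) {
        Path dirA = Paths.get(sourceA).toAbsolutePath().normalize().getParent();
        Path dirB = Paths.get(sourceB).toAbsolutePath().normalize().getParent();
        return dirA != null && dirA.equals(dirB);
    }
}
```

Under this rule, src/com/bigcorp/oe/Order.cls and src/com/bigcorp/oe/LineItem.cls would be treated as the same package, while a file dropped into patches/com/bigcorp/oe/ elsewhere on the propath would not, even though its package name matches.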

Whichever way Java does it, it has solved the same problem with CLASSPATH, so we could just look at its example and experiment.

Friend Classes

Both Mike and Thomas indicated that they do not have a problem with friend classes. I do. The problem with friend is that people don't understand how to use them properly and end up creating a nightmare in the process. You end up with code in each source file that takes for granted things about the state of the other object that make it really hard to debug and figure out the issues if it is not properly managed. Friend really does violate encapsulation and, for that reason, I am very much against us using this instead of a more general level of protection like package or library.

At least with package and library you are forced to think in terms of classes of which you have no other knowledge beyond what you have defined as an API for your class. That means that there is no way that your class is sharing another class' context. This makes for much better encapsulated code.

Use Cases

Thomas made the following statement:

Of course, my first thought is that needing such a thing is really a question of having not worked out the encapsulation properly, i.e., it is a crutch, so I would like to hear about some fairly concrete use cases in order to:

1) Assure ourselves that there is a valid need; and

2) Explore exactly what that need is.

I give up. I thought I did a pretty good job of articulating two cases in my initial post (the Session object in o4glrt.jar, and the internal classes in the Eclipse framework). I am certainly not going to try and work up another contrived example for you to decimate the way you did the code I posted in the "Statics" thread.

But I can think of two OpenEdge ISVs that sell commercial applications that should expose public APIs to their applications. Let's pick the idea of an Order Entry module and let's assume you have a class that is part of your publicly exposed API called "Order" that lives in a package called com.bigcorp.oe. This class should be exposed for any other module to call, so the whole thing is public. Create order allows you to create and add line items to the order in the business object. "LineItem" is exposed as a public class. "Order" has a public method on it called "Commit" that allows you to commit the order to the database. "LineItem" has a package-level method that is intended to only be called by members of the same package, such as Order, which commits that line. You should never be able to commit the line outside the scope of the order.

Now, to make this reusable, assume LineItem is also used by SalesOrder. SalesOrder is deployed as part of the same library, but it lives in a different package - say com.bigcorp.so. SalesOrder needs to be able to call Commit on LineItem, too, so instead of making it a package-level method, we change the scope to library and it is available to SalesOrder, but because it is included in a PL file, nothing else can call it.
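Sketched in Java, since the ABL has no such modifier today, the package-level half of this use case might look roughly like the following. The fields and the addLine method are assumptions added for illustration; both classes are imagined as public members of com.bigcorp.oe, with the package declaration left out so the sketch fits in one file.

```java
import java.util.ArrayList;
import java.util.List;

// Imagined as a public class in com.bigcorp.oe.
class LineItem {
    private final String sku;
    private final int quantity;
    private boolean committed;

    public LineItem(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }

    public String getSku() { return sku; }
    public int getQuantity() { return quantity; }
    public boolean isCommitted() { return committed; }

    // No modifier: callable only from the same package, e.g. from Order.
    void commit() {
        committed = true; // a real implementation would write the line to the database
    }
}

// Imagined as a public class in com.bigcorp.oe.
class Order {
    private final List<LineItem> lines = new ArrayList<>();

    public LineItem addLine(String sku, int quantity) {
        LineItem line = new LineItem(sku, quantity);
        lines.add(line);
        return line;
    }

    // Public entry point: commits every line as part of committing the order.
    public int commit() {
        for (LineItem line : lines) {
            line.commit();
        }
        return lines.size();
    }
}
```

Callers outside the package can build an order and call commit() on it, but cannot commit a LineItem on its own. Once SalesOrder in com.bigcorp.so needs the same call, plain package access no longer works and the method would have to become public, which is the gap the library modifier is meant to fill.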

Off the top of my head, that's the best non-technical use case I can come up with. It's not a bad one, but it's not a great one because I can see holes in the architecture. Point is, this is something that I use all the time in Java and .NET programming.

Posted by Thomas Mercer-Hursh on 25-May-2010 11:09

It seems to me that PLs as they currently exist have a different use than packaging for protection as described here, starting with the fact that one can keep deleting and adding to them after the PL was built.  To be used for library protection, one would end up having hundreds of .pls ... better than thousands of individual files, perhaps, but hardly the same thing as one big PL for the whole app.

Posted by Admin on 25-May-2010 11:33

It would be a completely different model of a PL.

I personally don't see Library protection as for the whole app, more for per module plus a framework or core module - usually those affect different teams of developers.

Much like one would set up projects (in Visual Studio at least).

Posted by Thomas Mercer-Hursh on 25-May-2010 11:48

Some clarifications ....

I don't think we can address the idea of library protection without specifying what changes in .pl files we would need to make it acceptable.  As they sit, anyone can add and replace anything in a .pl in an uncontrolled way and that seems to me to be opening the door wide.  It is not real protection at all.  One might as well make the member public and limit one's use of the public member because that is all the security it is actually providing.

Also, I'm interested in the deployment.  In my legacy app, I have approximately 15000 files organized into 600 directories.  The hierarchical structure of those directories provides the organization into modules and packages ... or, what would be packages in an OO application.  As things stand, it seems I would have to put all 600 package-level pl files in one directory on the PROPATH ... horrors or, slightly better, all the pl files for one application in the module directory and every module directory on the PROPATH ... not much better.  Yes, I realize that I only need a separate PL for places where I am using library protection, but that leads me to a non-uniform deployment strategy subject to accident.  Maybe there is a right way to do this, but I need to know what it is before I can get behind the idea.  Oh, and this implies having to put code into .PLs during development for testing.

I don't see in your discussion how it is that one keeps someone from adding a new object into an existing package and thus gaining access.  At the least, this seems to imply deployment in read-only PLs ... not something that appeals to me personally.

Err, where did I say that I didn't have a problem with friend classes?   I think they have the advantage that only specific other classes are named, but the downside that one is opening the kimono all the way.  I can understand that they might have some uses in frameworks and in debugging, but I'm very, very skeptical that they belong anywhere in application code.

As to use cases ... I'm sorry for being a PITA about this, but I think it is important.  In the static/private thread, I became convinced in the end that class private was the more appropriate model and so I'm willing to support moving in that direction, but I'm still unconvinced that the use case was the correct architecture.  I have that problem here as well.  Perhaps I need to go back and think more about your initial example, but the order and line example seems to me contrived.  What exactly is SalesOrder as distinct from Order and why would it not be part of the same hierarchy?  And, if we say Order versus Invoice ... they both have lines, after all, and are likely to be at least similar in their data elements and some of their logic  ... but, at the same time they are *not* the same thing, they have very, very different roles in the problem space and so shouldn't be mushed into a single object, even though there is some overlap in functionality.

Public and private are nice and clear. Public is part of the advertised interface and private is none of your business.  Protected might be a little suspicious, but at least it is confined to what ultimately becomes a single compile unit, i.e., it is private to the compile unit and public only to source files in the hierarchy.  Library, Package, and Friend, however, are all forms of violating encapsulation.  It is easy enough to come up with an example of "I just want to tweak this one thing over there" where this violation is needed, but that doesn't mean that there isn't an alternate way of structuring the application where the violation is not needed.  I'm not going to religiously suggest that it is never needed, but I think we all recognize that once one makes such a thing available, it is certain to be abused.

I.e., in the static/private discussion, I became convinced of the rightness of the interpretation, but I still don't actually have a use case because I would use a different architectural approach for the problem stated.  Here, any of these options are an invitation to violate encapsulation, so I would be much happier with some use cases of why that was genuinely necessary before I endorse any of them.

Posted by Thomas Mercer-Hursh on 25-May-2010 11:52

This is my question ... it doesn't seem that the current model for a PL works for the need.  Perhaps it can be made to do so with some extensions, but what are they?  It seems to me that we would actually need a new function which created read-only units which could themselves be embedded in a PL.

Posted by bsgruenba on 25-May-2010 14:28

Maybe I am missing something, but I don't understand how you think that package and library level protection would in any way aggravate the situation with regard to encapsulation. It can only enhance it. It takes methods that would otherwise be public, and limits their accessibility to a smaller set of classes. How does that aggravate an otherwise too permissive encapsulation model?

Posted by Admin on 25-May-2010 14:33

And the same is true for CAREFUL use of friend.

Posted by bsgruenba on 25-May-2010 14:40

I almost agree with you, Mike, except for one thing. Library and package level protection do not allow anything to interact with your class that you do not permit. Friend, as it is implemented in C++, stomps all over the encapsulation model and has the potential to be very badly abused, as is evident in many C++ applications.

If you say Friends with no benefits - ie, restricted friend, and we define what the restrictions are, I think we can get there.

Posted by Thomas Mercer-Hursh on 25-May-2010 15:05

Your contrasting some form of protected versus public seems to imply that you are assuming that the method or property must remain where it is ... but I'm not.  To be sure, if the member *has* to be there, then some form of protected is better than public.  The question is, does it really need to be there or is there some other solution?  My assumption in my own design is that if I end up with a member like this, then I haven't done my decomposition correctly.

Posted by Thomas Mercer-Hursh on 25-May-2010 15:06

Friend on an individual member?

Posted by bsgruenba on 26-May-2010 10:56

Thomas,

With all due respect, and I really mean those words, have you ever written an API that is being published to an audience where you don't know what their implementation is going to be? I'm not asking that question facetiously. I really am asking it because I genuinely want to know.

The reason that question is so important is that when you publish an API, it becomes almost immutable. You cannot change it without affecting people who are already using it, unless your API completely encapsulates behavior and prevents people from interacting with it in ways that you cannot anticipate. Implementations can be and should be refactored as the API evolves, but the published API cannot.

Pragmatism, and time to market demand that sometimes you do things as a short term solution that you plan to change later on. This is especially true when you are publishing a commercial API. If you solidly define the way in which you expect others to interact with you, and only make those mechanisms public, you have a shot at preventing them from interacting with you in ways you do not expect.

OpenEdge cannot give me this ability to evolve unless it provides the package/library-level protection that I am asking for.

Bruce

Posted by Thomas Mercer-Hursh on 26-May-2010 11:28

Yes, I have written such APIs.  It is one of the things for which a facade is useful.

Here's an idea for you.  Suppose you create an authorization object.  This can either be an application-wide, general purpose singleton object or it can be a simple one per object.  The object outside the package or library or whatever that wants to use a special function connects with the authorization object.  If this is application wide it provides the ID for the service it wants to use and such things as a reference to itself, the user token, and whatever you want to include in your authorization scheme.  If it is the simple version, then the handle to the object is provided by the main object and again the client provides what information you want.  By including the reference to the client object in both cases, you can assure that it is a trusted source.  The authorization object hands back a token.  The client then calls the method which you want to protect.  This method is defined public, but the first thing it does is to check the submitted token against the token in the authorization object and throws an error if it doesn't match.  Variations can easily include per call authorization and per session authorization in which a particular client gets authorized for any number of calls on any number of methods.

Yes, this has some overhead, but it is also completely controllable in any stage of development and deployment and easily extended to include any parameters of authorization you might want.
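A minimal Java sketch of the simple variant, to make the moving parts concrete. Everything here is an assumption for illustration: the class names (Authorizer, ProtectedLineItem, SalesOrderClient, Outsider), the use of the client's class as the trust criterion, and random UUID strings as tokens.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical authorization object: hands out tokens to trusted clients only.
class Authorizer {
    private final Set<Class<?>> trustedTypes = new HashSet<>();
    private final Set<String> issuedTokens = new HashSet<>();

    // Registers a client type as trusted (the trust rule is an assumption).
    void trust(Class<?> clientType) {
        trustedTypes.add(clientType);
    }

    // The client passes a reference to itself so its origin can be checked.
    String requestToken(Object client) {
        if (client == null || !trustedTypes.contains(client.getClass())) {
            throw new SecurityException("untrusted client");
        }
        String token = UUID.randomUUID().toString();
        issuedTokens.add(token);
        return token;
    }

    boolean isValid(String token) {
        return token != null && issuedTokens.contains(token);
    }
}

// The protected method is public, but the first thing it does is check the token.
class ProtectedLineItem {
    private final Authorizer authorizer;
    private boolean committed;

    ProtectedLineItem(Authorizer authorizer) {
        this.authorizer = authorizer;
    }

    public void commit(String token) {
        if (!authorizer.isValid(token)) {
            throw new SecurityException("invalid token");
        }
        committed = true;
    }

    public boolean isCommitted() {
        return committed;
    }
}

// Stand-ins for a trusted caller and an arbitrary outside caller.
class SalesOrderClient { }
class Outsider { }
```

This is the per-session flavour: a token stays valid for any number of calls once issued. A per-call flavour would remove the token from the issued set inside isValid.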

Posted by Mike Ormerod on 26-May-2010 12:52

"The reason that question is so important is that when you publish an API, it becomes almost immutable. You cannot change it without affecting people who are already using it, unless your API completely encapsulates behavior and prevents people from interacting with it in ways that you cannot anticipate. Implementations can be and should be refactored as the API evolves, but the published API cannot."

Which is why in web services you often see versioning of the API, because as you say, once published you can't change it, even more so if it's a public API of any kind, as you may have no real idea who, or how many are actually consuming your service.

Posted by Thomas Mercer-Hursh on 26-May-2010 12:57

The question is, how does the nature of an API being semi-immutable lead to a requirement for semi-private data or behavioral members?  Given that the API should probably be a facade to start with, so it can do whatever it needs to do "under the surface", I don't see the connection.

Posted by Admin on 26-May-2010 13:18

The question is, how does the nature of an API being semi-immutable lead to a requirement for semi-private data or behavioral members?  Given that the API should probably be a facade to start with, so it can do whatever it needs to do "under the surface", I don't see the connection.

The API would need to be public... When it's a façade talking to other objects the actual functional objects would need public APIs as well.

Wouldn't package/library level access be a good utility to hide the actual functionality while it's still accessible from the public API?

Posted by Thomas Mercer-Hursh on 26-May-2010 13:43

I don't see the requirement.  The purpose of an API facade is to present a semi-immutable signature for others to interact with.  It typically isn't to prevent them from accessing the objects behind the facade, if there were some reason to do so ... if they even have a way to get a reference to those objects.  Reaching behind the curtain should, of course, get one's hand slapped because one has then made the interface dependent on implementation specifics instead of the published API, but does that really need to be enforced by semi-private methods?  If so, there are certainly various techniques to ensure that an object can only be instantiated in an appropriate context.  That would prevent an "outsider" from creating their own instance of one of the underlying objects.

Posted by bsgruenba on 26-May-2010 13:47

Slapping a hand is all well and good. What Mike and I are saying is prevent people from being able to get at the implementation-specific stuff to start with. I'm really having a hard time understanding why you see any issue with this at all.

Posted by Thomas Mercer-Hursh on 26-May-2010 14:01

I'm not totally opposed ... although certainly skeptical.  It doesn't seem to me, though, like any of the proposals solve that problem in an effective and desirable way.  Package seems completely vulnerable unless combined with a technology which allows the equivalent of a read-only jar so that you could really tell that some new object was not in the same package really.  Given the whole way PROPATH works at this point, which is to avoid that kind of lock-in, I'm not sure what one would do.  Library is meaningless without read-only or CRCd packages and then, unless you create something new specifically for that, you goof up someone doing what they do now with PL files, unless you allow embedding a PL within a PL.  And friend exposes the entire implementation, not just selected members.  I just don't see any of those as an improvement.

Posted by bsgruenba on 26-May-2010 14:11

And I think that most of us agree that the existing PL model is inadequate. I don't think there is debate about this. I would contend, though, that modifying/altering a PL file could constitute reverse engineering, if this is stated in the license for the software, which not only makes the use of the API unsupported, but results in a material breach of the license agreement.

Making the same assertion about APIs that are denoted as public is a harder thing to do. Someone would have to actively reverse-engineer the API in the former example. In the latter example he would merely have to use it.

Posted by Thomas Mercer-Hursh on 26-May-2010 14:28

My sense is that, not only is the existing PL inadequate, it isn't the right tool since it already has a different purpose.

I don't see a meaningful difference in trying to tell someone to not modify a .pl over telling them that they are only allowed to use the public API.  In practical terms, at the least it would put you in the business of compiling and packaging a .PL file for every version and platform instead of letting people just compile for themselves.  If it is compiled code, there aren't very good tools at this point for even discovering how to use the unpublished classes.  If it is source, then any protection is largely based on trust.

Posted by Admin on 26-May-2010 16:28

Exactly.

My experience is that a developer tends to use what is public and/or accessible, no matter how often you tell him he shouldn’t because you intend to modify it.

Who on this thread has never used undocumented Progress keywords or startup parameters (in other words features accessible but not intended for public use)?

I have. And I have to rely on the author (i.e. Progress) to not change them. Had they been documented (in other words public), Progress would think two or three times before changing them.

Posted by Thomas Mercer-Hursh on 26-May-2010 16:43

The issue is not developer personality ... how did you find out about these keywords and parameters?  Someone blabbed, right?  And then word got around.  And, once you knew, you could use it or not use it, but you had no influence over it.

But here, if I provide a package with an API, one of two situations exist.  One is that it is compiled code and the other is that there is source.  If compiled, how exactly is the developer going to know about these undocumented methods, even if public?  Even if someone blabs or they find out, it should be very clear that the use is unsupported.  If they have source and can easily read about the methods themselves, then you have no protection, no matter what you do because they can just change it.

And, what exactly is the motivation?  With the Progress keywords and parameters, it was to access things you couldn't do otherwise.  And here?  If your package is capable of doing something, haven't you as the API developer made the API the easiest way to access the functionality of the package?

And, regardless of what people will try to do, if the protection is easily broken, then it doesn't seem very meaningful to me.

Posted by Admin on 26-May-2010 16:53

The issue is not developer personality ... how did you find out about these keywords and parameters?  Someone blabbed, right?  And then word got around.  And, once you knew, you could use it or not use it, but you had no influence over it.

With OO code nobody needs to "blabber". There's a tool for that: The class browser.

 

And, what exactly is the motivation?  With the Progress keywords and parameters, it was to access things you couldn't do otherwise.  And here?  If your package is capable of doing something, haven't you as the API developer made the API the easiest way to access the functionality of the package?

 

With OO frameworks just relying on public/private/protected there will be public APIs intended for use and those not intended for use. A developer will need to read documentation to find out the details. Who reads documentation? Most developers tend to just try things out.

I know that's human weakness. Call me an idealist. But I believe that computer software should be used to prevent humans from making mistakes. An effective package or library level access mode would be a very efficient vehicle for that.

It's just the logical next step to strong typing: Strong typing protects you from accessing class members that don't exist. Package or library level access mode would protect you from accessing what you should not access.

Posted by Thomas Mercer-Hursh on 26-May-2010 17:10

Without an effective enforcement mechanism, it seems moot.  The only effective mechanism I have seen described here is Friend and, of course, that only if one doesn't distribute source and it is the ugliest in terms of exposing all of the implementation of one class to another.

Frankly, I'm not really convinced the API example is a valid one either.  Why exactly do I want people to not use the component classes if they are useful?  The reason for the facade is primarily to simplify use, not to prevent people from knowing about the implementation.  What sort of thing do I want to conceal that isn't concealed within the classes themselves?

Posted by Admin on 27-May-2010 00:08

Without an effective enforcement mechanism, it seems moot.

That's exactly why in my first post on this thread I did mention the requirement for a change in the way .PL files are handled.

 

Frankly, I'm not really convinced the API example is a valid one either.  Why exactly do I want people to not use the component classes if they are useful? 

Because I want to be able to decide that I want to change them later without being the bad guy. Maybe if the language evolves - which would be out of my control but I'd like to adopt that somewhere, even in the Interface. I want two of my classes being able to communicate with each other (by the means of invoking methods) without my customer being able to communicate with one of the classes.

The reason for the facade is primarily to simplify use, not to prevent people from knowing about the implementation. 

That's why we need another instrument.

But I'm giving up! You have different experience with developers than Bruce and I seem to have. We won't find a common understanding again - so I leave the last word for you.

Posted by Peter Judge on 27-May-2010 08:54

With OO frameworks just relying on public/private/protected there will be public APIs intended to use and those not intended to use. A developer will need to read documentation to find out the details. Who reads documentation? Most developers tend to try out.

I know that's human weakness. Call me an idealist. But I believe that computer software should be used to prevent humans from making mistakes. An effective package or library level access mode would be a very efficient vehicle for that.

It's just the logical next step to strong typing: Strong typing protects you from accessing class members that don't exist. Package or library level access mode would protect you from accessing what you should not access.

People will use things the way they want to, regardless of how they're "supposed" to, or how the feature was intended to be used. We saw this a *lot* with ADM2. It's also true of the ABL, I believe. Documentation doesn't help here, as Mike says: you need to have some form of access control. And allied with that, I also believe that there are 2 approaches you can take here: the first assumes that people will respect your access levels voluntarily; the second, that they must be forced into compliance. In most cases this is more of a philosophical issue than anything else.

And I'll join Mike firmly in the "idealist" camp.

-- peter

Message was edited by: Peter Judge

Posted by bsgruenba on 27-May-2010 11:24

Well I am not an idealist. I'm a pragmatist. I firmly believe that if you give a man enough rope, he will hang you (and no, that's not a typo). I also firmly believe that you should "do unto others before they do unto you."

Net effect: Anything that you do not stop someone from doing, they will do.

Posted by Thomas Mercer-Hursh on 27-May-2010 11:56

I think we have two entirely different issues here and they are getting muddled.

One issue revolves around whether one needs to or wants to control how a library with a defined API gets used, specifically whether use is confined to a defined API and how concerned one is about the developer poking around on the insides and using something there directly.  Being oriented toward open source, I am more concerned about providing people with good models than about controlling their use.  If someone figures out a way to use what I have written to accomplish something more, then more power to them and I hope they contribute feedback so the package can be enhanced to make that use easily available to everyone.  As long as I make it clear that the intended use is through the API, I feel no qualms about changing the implementation behind that API in a later release and letting anyone who has gone behind the curtain figure out what to do about it.  It is a choice they get to make.

I do understand that people who offer commercial software in r-code form only might want more control.  There are, of course, precious few examples of this in the ABL world.  I'm still not sure what the big concern is, as long as you tell people up front that the contract is for the API and they may have to fend for themselves if they go elsewhere.  E.g., people have known all along that if they start muddling around in the innards of ADM they were going to have some issues moving to a new version, so they should do so carefully.  Not everyone did, of course, but I don't feel the responsibility to force them to better behavior.

But, let's take it as a given that there is someone who is going to publish r-code only and who has a desire to completely lock down the implementation so that it can only be used through the published API.  Now what?

That brings us to the second issue, i.e., an effective means of implementing such control.

The package approach appears to fall short unless we introduce some kind of read-only container for the package and the package protection detects the install point of the container and limits the access to other modules in the same container.  This presents some difficulties.  First, we need the container and, unless we are going to create a problem in deployment, that container should be capable of being included in an application-wide PL, e.g., if someone wants to do a memory mapped PL.  Second, to do any testing in development, we have to keep building the container.  Third, implementation may be non-trivial since all existing forms of protection are checked at compile time and this form of protection inherently has to be checked at run time.

The library approach has more or less the same issues.  I.e., we need a new container (trying to add additional functionality to the existing PL is not likely to be satisfactory unless we can embed PLs within PLs), we need to keep building the library during development, and the required check is run-time instead of compile-time.

The friend approach seems to have the fewest implementation issues in that it doesn't require a new container, it doesn't require builds during development, and it is protection that can be checked at compile time.  Moreover, it is a very specific one to one permission which doesn't open itself to user additions ... unless, of course, they reverse engineer the friend and replace your component and thus give access.  The big downside of friend is that it is aesthetically obnoxious from an OO point of view because one has exposed the entire inner implementation of one object to another object.  It would be less obnoxious if it could be implemented on a member by member basis, something like "private except class xyz".

I'm not trying to fight anyone here, just exercising a healthy skepticism about the need, since these are, after all, all violations of encapsulation and failures to separate responsibilities, and pointing out that there is no point wishing for something unless one has a viable plan for the right tool ... that's what Bruce started the thread about, wasn't it?  Brainstorming for a solution?  All I am doing is pointing out the issues with the solutions thus far proposed.
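
For readers who want to picture the member-by-member "private except class xyz" idea, here is a rough sketch in Java, which has no friend keyword. It uses what is sometimes called a passkey: the protected method demands a key object that only the one permitted class can construct. All names (Account, Auditor, Key) are made up for illustration, and this is an approximation of the idea rather than a proposal for ABL syntax.

```java
import java.util.Objects;

// Hypothetical names throughout; nothing here is an ABL or PSC API.
final class Account {
    private long balanceCents = 0;

    // Ordinary public API: anyone may read the balance.
    public long balanceCents() {
        return balanceCents;
    }

    // "Private except class Auditor": callers must present an Auditor.Key,
    // and only code inside Auditor can construct one.
    public void adjust(Auditor.Key key, long deltaCents) {
        Objects.requireNonNull(key, "an Auditor.Key is required");
        balanceCents += deltaCents;
    }
}

final class Auditor {
    // The key type is visible to other classes, but its constructor is
    // private, so only code inside Auditor can create an instance.
    static final class Key {
        private Key() {}
    }

    private static final Key KEY = new Key();

    // Auditor is the one class that can legitimately call Account.adjust.
    static void correct(Account account, long deltaCents) {
        account.adjust(KEY, deltaCents);
    }
}
```

The key check is mostly a compile-time matter (writing `new Auditor.Key()` outside Auditor does not compile), with a null check at run time. It limits exposure to the one member that needs sharing, but it does nothing against someone who has the source and edits it, or who uses reflection.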

Posted by bsgruenba on 27-May-2010 12:21

The interesting thing about this thread is that the reason that I raised it was precisely because of the OpenSource issue.

OpenSource is not a free-for-all. In fact, with OpenSource it is probably true to say that this issue is actually bigger than anywhere else. Remember - my second example was Eclipse and that was based on work that I did while I was working at Progress on the catalog search for OpenEdge Architect.

The Search functionality in Eclipse was not a publicly exposed API. It was still under development and they were still considering options, but Eclipse had to have a search. So they had included all of the search functionality in an internal package. Some of it was intended to be exposed later. Some of it was definitely not going to be exposed later, but because the architecture was still under development, things had to be exposed outside the class hierarchy across the .jar file. The net result was that the functionality was exposed publicly because Java did not have assembly/library-level protection.

OpenSource projects are only successful because they are agile. Agile projects almost always have incomplete code in a release. That incomplete code needs to be isolated so that people cannot use it as a public API.

It is a grave mistake to conclude that you can ship a product and not expect that the architecture will grow over time, and with OpenSource, this is more true than anywhere else. So your OpenSource argument actually counters the point of view that you were raising.

Posted by Admin on 27-May-2010 12:23

The library approach has more or less the same issues.  I.e., we need a new container (trying to add additional functionality to the existing PL is not likely to be satisfactory unless we can embed PLs within PLs)

Sorry, I meant not to reply anymore - but that makes no sense. Could you please elaborate why in the ABL we need to put a .PL into a .PL when languages like the .NET languages don't need to be able to put an assembly into an assembly? Are you saying that MS's concept here is better than a not even specced solution from Progress?

Is it actually possible to put a .jar file into a .jar file?

Posted by Thomas Mercer-Hursh on 27-May-2010 12:30

Those languages don't need to nest because a jar file or whatever *is* the unit of deployment.  The standard of deployment is N of these containers, where N is usually a smallish number and each container may come from a different source.  That is not the standard of deployment for ABL, which is either deployed as individual files or as one or more big PL files which have no relationship to the units one would want to protect with package or library protection.  If you overload a PL file with read-only and the like for the purposes of managing the protection, you are going to have people wanting to include them in their application wide PL.  What you really want is a way to "zip" up one or more directory(ies) and replace it with a file that takes the place of that/those directory(ies) and which can be included in the other deployment mechanism, i.e., a PL.

Posted by bsgruenba on 27-May-2010 12:30

jar into jar - You can, because it is a zip file, but your classes are signed and there is no notion of library-level protection. I don't believe you can call a class that lives in a jar file inside a jar file, but I stand to be corrected. The only time I have seen a jar file in a jar file is in situations where the included jar contained the source code for the compiled classes.

Posted by Thomas Mercer-Hursh on 27-May-2010 12:40

Only if you assume "open source" and distributing r-code.  If you distribute source, it doesn't really matter what kind of protection you put in the code because someone can change it.

Let's separate the two issues.  In fact, let's drop the issue of whether one wants to have such protection and just accept that some of you obviously do.  Let's also drop the discussion of whether it is desirable on OO principles except to notice things like friend exposing more than one really wants to expose.

Instead, let's get back to your original question of what can we do and does it actually work and accomplish the purposes.  Other than competition for development dollars, I have no objection to you getting some additional protection options, but let's see if there is a real proposal here.  It seems to me that all of the proposals have some complications and limitations.

The only one that I think accomplishes the purpose without excessive exposure based on what has been said up until now is "private except class XYZ".  It's a bit tacky, but it is a compile time check which requires no new packaging technology and which limits the exposure to the members which actually need to be shared.

Posted by Admin on 27-May-2010 12:43

>  What you really want

I'm not sure that you understand what I want.

is a way to "zip" up one or more directory(ies) and replace it with a file that takes the place of that/those directory(ies) and which can be included in the other deployment mechanism, i.e., a PL.

I'm thinking more about generating .PL files from OpenEdge Architect projects.

Posted by Thomas Mercer-Hursh on 27-May-2010 12:53

I'm not sure that you understand what I want.

Do you want something other than for one class to be able to expose members to a limited selection of other closely related classes, however designated, while remaining private with respect to all other classes?

I'm thinking more about generating .PL files from OpenEdge Architect  projects.

Merely generating a .PL for a designated collection would appear to be reasonably trivial. But,

1. PLs would have to be read-only in order to enforce either package or library security.

2. The security would have to recognize that the code for a particular class was coming from within the PL, not somewhere else on the PROPATH.

3. Packaging small units of protected code would have to be rationalized in some way in relationship to the use of a PL as a primary deployment tool for whole applications, including memory mapped PLs.

This seems like a fairly challenging set of requirements, so I'm wondering whether either package or library protection are the best ideas.

Posted by Thomas Mercer-Hursh on 28-May-2010 13:58

Can't one implement the equivalent of method-specific friend protection by defining the method to include having the calling object pass in a reference to itself and then have the method check the type of that object?  I suppose, to be really secure one would have to provide the result via a callback or property set to that object instead of just returning on the method call.
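
A rough illustration of that idea, written in Java rather than ABL so it can be run as-is. The class and method names (Vault, TrustedClient, requestSecret, receiveSecret) are invented for the example. The caller passes a reference to itself, the method checks its type, and the result goes back through a callback on that object instead of a return value.

```java
// Hypothetical names; a run-time checked, method-specific "friend".
final class Vault {
    private final String secret = "s3cr3t";

    // The caller passes itself. The result is delivered by calling back into
    // that object, so passing in a borrowed TrustedClient reference does not
    // hand the secret to whoever made the call.
    public void requestSecret(Object caller) {
        if (!(caller instanceof TrustedClient)) {
            throw new SecurityException("caller is not a TrustedClient");
        }
        ((TrustedClient) caller).receiveSecret(secret);
    }
}

// final, so nobody can subclass it to pass the type check.
final class TrustedClient {
    private String lastSecret;

    // Callback used by Vault to deliver the result.
    void receiveSecret(String secret) {
        this.lastSecret = secret;
    }

    String fetchFrom(Vault vault) {
        vault.requestSecret(this);
        return lastSecret;
    }
}
```

Note that the check happens at run time, not compile time, so a wrong caller only finds out when the call fails.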

Posted by Thomas Mercer-Hursh on 03-Jun-2010 12:58

In talking about this with someone else, it has been pointed out to me that in a run-time environment, you can only use an object to which you have a reference and there is generally no way to navigate from a facade to the objects behind the facade.  This means there is a question of how it is that another object is going to get a reference to one of these objects you want to protect.

The obvious answer is to NEW an instance of the object outside of the context of the container, but I would think there are things that one can do to limit that.

I suppose the other possibility is to navigate the session tree to find an object of the desired type, but if the object used callbacks instead of simple method returns, that wouldn't work either.

Posted by bsgruenba on 04-Jun-2010 11:23

Your obvious answer is *the* answer. I'm interested to know what you think the things are that can be done to limit this.

Posted by Thomas Mercer-Hursh on 04-Jun-2010 11:54

Making a successful NEW dependent on the context.  Doesn't seem like very nice OO, though.
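
One way to picture "NEW only succeeds in the right context" is the following Java sketch, where the restriction is enforced by the compiler rather than at run time. The names (ReportFacade, Formatter) are hypothetical. The protected class is nested inside the facade and has a private constructor, so only code inside the facade can create it.

```java
// Hypothetical names; the facade is the only code that can NEW the worker.
final class ReportFacade {
    // Nested so that its private constructor is reachable from ReportFacade
    // but from no other top-level class.
    static final class Formatter {
        private Formatter() {}

        String format(String title) {
            return "== " + title + " ==";
        }
    }

    private final Formatter formatter = new Formatter();

    // The published API: callers never receive a Formatter reference.
    public String render(String title) {
        return formatter.format(title);
    }
}
```

Writing `new ReportFacade.Formatter()` in any other class is a compile error. Whether something comparable is practical in ABL is exactly the open question in this thread.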

One of the things I wonder about here is how necessary this is.  I.e., the requirement seems to derive from a desire to control other programmers' access rather than any actual OO principle.  I mean, suppose I create a library for some function and provide a facade.  I give this to people and tell them that I reserve the right to change the implementation, but I will try really hard not to change the interface via the facade except by expanding it.  I.e., the standard contract one seeks to provide for any object or package.  But, as a part of this library I needed to create some object X which does this cool, handy thing that people might like to use in a context other than the library.  Why am I not publishing the interface to that object as well rather than trying to keep people from using it other than through the facade?

Posted by Thomas Mercer-Hursh on 02-Jul-2010 14:34

Question:  Do protection mechanisms exist to *enforce* access restrictions or do they exist to keep us from making unintended mistakes?

If the former, then there are all kinds of protection that one might want and, in the end, one might even need a security and authorization system to control things.

If the latter, then the requirements for protection are much less.  I.e., if one provides a facade for a subsystem, then it is clear that one is supposed to manipulate the subsystem using the facade and that going directly to the other objects in the subsystem is violating that design.  At some level, one can't prevent people from doing things like this, just like OO doesn't ensure that people will use good design principles.  So, does one actually need to enforce it?

This thread is closed