Good practice: always define buffers?

Posted by Nick Vancauwenberghe on 11-May-2017 04:29

Hello,

What code do you like the most, left or right? See screenshot in link below:[View:https://drive.google.com/file/d/0B1P8s6ZKpj7wSnVTLVNONW1laVU:320:50]

https://drive.google.com/a/tvh.com/file/d/0B1P8s6ZKpj7wSnVTLVNONW1laVU/view?usp=drivesdk

Please elaborate why? I'm very interested in the answers.

All Replies

Posted by James Palmer on 11-May-2017 04:56

Apart from the fact that the right hand one won't work (not all references to Company have been updated), I think the left is better. It's good practise to do your updates on a buffer with a strongly scoped transaction block, but using buffers elsewhere isn't necessary.

My opinion of course.

Posted by ChUIMonster on 11-May-2017 07:00

Your opinion is wrong ;)

Because this is an internal procedure you very much want to use a buffer scoped to the IP.  If you just allow "free references" willy nilly you will be elevating the scope of the default buffer to the procedure block and causing unintended side-effects.  Even if all you do is NO-LOCK reads this is a bad thing.

My preferred approach is to include buffer definitions such as:

define buffer customer for customer.

for all of the tables referenced in an internal procedure.  This ensures that no accidental side effects occur due to unintended references to the default buffer.  Of course you should also define a distinct "update" buffer and use strong scope for any updates -- that's a second buffer.

Posted by Brian K. Maher on 11-May-2017 07:04

Ditto on what ChUIMonster said.

Posted by ChUIMonster on 11-May-2017 07:05

BTW -- the "buf" prefix and other junk (like "l") is hideous.   Cluttering up your names with noise about the datatype makes your code unreadable gibberish.

FIND FIRST to find a unique record is also an abomination.  Suppose there are two or more companies meeting your criteria?  Why aren't  you dealing with the rest of them?  Or at least throwing an error that there should only be one?  Why is the one that you just found magic?

Posted by Brian K. Maher on 11-May-2017 07:14

Hi Tom,

FIND FIRST to find a unique record is also an abomination.  Suppose there are two or more companies meeting your criteria?  Why aren't  you dealing with the rest of them?  Or at least throwing an error that there should only be one?  Why is the one that you just found magic?

I disagree with this.  If you know that the records are unique and a FIND will never return more than one record then FIND FIRST will actually save you a wee bit of time because the ABL doesn't have to do its check for duplicates (i.e. the AMBIGUOUS function).  Outside of that corner case you are correct.

Brian

Posted by marian.edu on 11-May-2017 07:20

huh? you only know records are unique if there is a unique index defined (and the damn fields that make it are mandatory and not null)… but then if the developer ‘knows’ that so should the ABL and don’t waste any time to check for anything :)



Marian Edu

Acorn IT 
+40 740 036 212

Posted by Brian K. Maher on 11-May-2017 07:26

Nope.  This is how it works and the info came straight from Mary Szekeley's mouth. By default we do what we need to do to determine if AMBIGUOUS should return true/false and by using the FIRST phrase you are telling us that you don't care about ambiguity so we don't do the work.

By the way, we are talking about nanoseconds here so for most people it won't really matter but for those who need to remove every unneeded nanosecond this will matter.

Posted by ChUIMonster on 11-May-2017 07:27

With all due respect I think you are perpetuating a myth.  There is no check for ambiguity on a unique find -- I have never found an iota of evidence to support that contention.  If the FIND is using a unique index there is no need to check anything -- the very nature of the index ensures that.  If it is *not* a unique index then you should be using a statement that deals with multiple records such as FOR EACH.

Even if I am wrong (I'll happily buy you lunch if you can show that I am) -- the statement is still misleading the next programmer to come along and, in my mind, the downside of misleading code far outweighs any hypothetical minuscule saving of execution time.  FIND FIRST is waving a flag saying that there could be multiple records -- but then doing nothing about the possibility.

Posted by ChUIMonster on 11-May-2017 07:30

BTW, I've talked to Mary about this too -- and I didn't hear what you heard.  Maybe we asked slightly different questions?

Posted by Brian K. Maher on 11-May-2017 07:32

Tom,

It came directly from the person who wrote the code.  I stood in her office while she explained it to me many, many years ago.

Brian

Posted by Brian K. Maher on 11-May-2017 07:35

Tom,

I am sure there are optimizations in the code related to the ambiguity check to make it as efficient as possible so it is likely it can detect the 'perfect scenario' and just flag ambiguity as false but I do know that it is possible that a second read has to be done to satisty the ambiguity check.

Brian

Posted by marian.edu on 11-May-2017 07:38

Well Brian,


if it was many, many years ago maybe it was fixed already otherwise it’s always a good time to do something about it :)

That should be resolved even at compile time, if the runtime really need that ‘assurance’ then make the compiler ‘optimise’ the code but don’t try to justify usage of a bad statement with the limitations of the compiler/runtime :(


Marian Edu

Acorn IT 
+40 740 036 212

Posted by Brian K. Maher on 11-May-2017 07:41

Marian,

It's my understanding that this behavior has never changed (i.e. horse left the barn long ago).

Brian

Posted by Nick Vancauwenberghe on 11-May-2017 07:41

I update the screenshot, I think all references are ok now. Thanks for noticing my quick copy paste mistake.

Posted by ChUIMonster on 11-May-2017 07:59

Like I said, I've asked Mary about it too.  I didn't video the conversation so I have to rely on my pitiful wetware capabilities but my recollection is that she said that a UNIQUE FIND with a UNIQUE index does NOT check for ambiguity.  What would it even check?  Maybe we can invite her to drop in sometime and we can have a spirited discussion :)  Or maybe we can get some of the current engineering people to weigh in on what the current code does.

Posted by Brian K. Maher on 11-May-2017 08:04

See my previous response where I stated that I am sure there are optimizations in the code.  That perfect scenario you are talking about is most likely one of them but I am talking about the code that does not have that perfect scenario.

Posted by Nick Vancauwenberghe on 11-May-2017 08:09

Tom,

Is this a better code snip?

[View:https://drive.google.com/a/tvh.com/file/d/0B1P8s6ZKpj7wNGlRbUhtZ1RxeDg/view?usp=drivesdk:550:50]

The code snip for illustrating "our" preferred coding practices. I'm aware that the data type prefixes are not done according "clean code" (the book :-) ).  And we also made CompanyCode a unique index.

Posted by Nick Vancauwenberghe on 11-May-2017 08:11

Tom,

my goal is to define good naming conventions. What names will you use for your buffers in this case?

Posted by Fernando Souza on 11-May-2017 08:12

Yes, a unique find will try to avoid an ambiguous check. If an unique index is used and it can guarantee uniqueness, there is no check unless unknown values and/or BEGINS is involved. Otherwise, a check is performed.

Posted by Tim Kuehn on 11-May-2017 08:16

Unique indexes are not always unique -  if one or more key fields are ?.

Posted by ChUIMonster on 11-May-2017 09:08

Code that does not have the perfect scenario is an even worse problem... Because there really can be more than one record in that case using FIRST is almost certainly a bug waiting to happen.  

Posted by cverbiest on 11-May-2017 09:43

If I spot "find first" in code without "find next" I always wonder what's wrong with the second, third, ... record.

Why didn't they use find last ?

If the index is unique, and find first is used,  I assume the developer does not know the database schema (s)he is coding against.

Just my 0,02€

Posted by Thomas Mercer-Hursh on 11-May-2017 09:58

Even if there was an infinitesimal performance gain, I would argue against this practice as it confuses the next programmer to work on the code.  The appearance is of non-unique records of which one is getting the first one.  It that were actually done, it would be a violation of normal form.  If the find is actually unique, one is confusing the programmer, who may not have advance knowledge of the related schema.

But, I also oppose define buffer customer for customer.  By using unique names you make it immediately clear where the scope of that buffer is.

Posted by ChUIMonster on 11-May-2017 11:42

In my code I like to use a buffer named the same as the table for all no-lock read access.  I create a buffer prefixed with "upd" for buffers that I intend to update (which includes CREATE statements).

Posted by ChUIMonster on 11-May-2017 11:44

Maybe we should also discuss the correct way to indent IF statements?  ;)

Posted by Brian K. Maher on 11-May-2017 11:49

No indentation.  Ever.  Left aligning all code at column 1 is the only way to go.  There are only so many columns so we need to economize.  <smile & laugh>

Posted by ChUIMonster on 11-May-2017 12:24

Now that that's been sorted...

Posted by Patrick Tingen on 12-May-2017 03:50

I assume that it is also common knowledge that you have a small performance gain if you do not indent. That is because spaces end up in the object file and then the ABL engine has to skip them before it reaches an executable statement. Although small, it /is/ a performance gain. The same is true for short versus long variable names. Short names execute faster.

We stopped indenting and use one-letter variable names only and our code is blazingly fast!

Posted by Nick Vancauwenberghe on 12-May-2017 03:59

[mention:dc2ff39fa15940708b78017f1194db6a:e9ed411860ed4f2ba0265705b8793d05] about the correct way of indenting, I'm really interested.

These kind of best practice style rules, is this documented somewhere? My friend Google couldn't help me.

Posted by Peter Judge on 12-May-2017 09:56

Patrick, you forget the cost of CRLF's .... ONE line for all code runs at ludicrous speed.

Posted by Lars Neumeier on 13-May-2017 10:55

I would always use a FOR FIRST instead of a FIND or FIND FIRST (no need to use NO-ERROR). At least for MSSQL connected databases through the dataserver FOR FIRST include the SELECT ... TOP 1 phrase compared to a simple FOR, which result in a performance benefit.

community.progress.com/.../sql-statement_generated_from_for_buffer_should_contain_top_1

This thread is closed