Some background info: http://www.codinghorror.com/blog/2007/03/primary-keys-ids-versus-guids.html and a thead on the peg (see below the sig).
I'm interested to use the GUID as primary unique key, because of the pro's. What I forgot to mention in the thread on the peg is that a sequential number, f.e. for an ordertable is useful/needed. You can make a PUK (orderid field) and assign a GUID to it, and make the index on the ordernumberfield unique
(and assign it with a sequence). Expect the create of an order to become a bit slower and queries with joins on the GUID fields ditto. But that could be acceptable. Thoughts?
Regards, Stefan.
peg | x |
| Jan 30 ![]() | ![]() ![]() | ||
|
| Jan 30 ![]() | ![]() ![]() | ||
|
| Jan 30 ![]() | ![]() ![]() | ||
|
Jan 30 ![]() | ![]() ![]() | |||
|
| Jan 30 ![]() | ![]() ![]() | ||
|
Jan 30 ![]() | ![]() ![]() | |||
|
Jan 30 ![]() | ![]() ![]() | |||
|
Jan 30 ![]() | ![]() ![]() | |||
|
Jan 30 ![]() | ![]() ![]() | |||
|
Jan 30 ![]() | ![]() ![]() | |||
|
On 01/30/12 14:15, Stefan Houtzager wrote:What would be good reasons to abandon the use of sequencesThe only good reason to use guid as primary key is when you somehow need
and replace them with guid's?
to merge the data into one table - that is, when you essentially have a
distributed database. Other that that, guid's are hard to read/write,
need additional storage
and take more time to generate.
| Jan 30 ![]() | ![]() ![]() | ||
|
| Jan 30 ![]() | ![]() ![]() | ||
|
Jan 31 ![]() | ![]() ![]() | |||
|
Jan 31 ![]() | ![]() ![]() | |||
|
| Feb 1 ![]() | ![]() ![]() | ||
|
Feb 1 ![]() | ![]() ![]() | |||
|
| Feb 1 ![]() | ![]() ![]() | ||
|
Feb 1 ![]() | ![]() ![]() | |||
|
| Feb 1 ![]() | ![]() ![]() | ||
|
| Feb 1 ![]() | ![]() ![]() | ||
|
Feb 1 ![]() | ![]() ![]() | |||
|
Feb 2 ![]() | ![]() ![]() | |||
|
Feb 2 ![]() | ![]() ![]() | |||
|
Feb 2 ![]() | ![]() ![]() | |||
|
Feb 2 ![]() | ![]() ![]() | |||
|
Feb 2 ![]() | ![]() ![]() | |||
|
Feb 2 ![]() | ![]() ![]() | |||
|
Feb 2 ![]() | ![]() ![]() | |||
|
My primary objection against using technical keys is that when there is a functional key available (invoice number, debtor number, g/l account) that any table referencing this data must always join to referential table to be able to show or sort on anything meaningful. Imagine a g/l transaction table with debtor numbers and invoice numbers. The query for showing the results sorted by debtor by invoice number is simply:
FOR EACH gltrans
WHERE ...
BY debtor BY invoice_nr:
...
END.
When using technical keys this becomes:
FOR EACH gltrans
WHERE ...
, debtor
WHERE debtor.guid = gltrans.debtor_guid
, invoice
WHERE invoice.guid = gltrans.invoice_guid
BY debtor.debtor_no BY invoice.invoice_no:
...
END.
I get the impression that result set based databases (SQL Server / Oracle) are very good at doing this. I am not so sure about ABL.
Well, yes, but the minute you want the name or default sign or anything else out of the account, then you have to do the join anyway. The only time you can avoid the join is when the simple account number is all you want and you need no more information.
One of the big attractions of *not* using functional keys is that it is much more forgiving about mistakes. Assign the wrong customer number, create a few dozen transactions, and then fixing your mistake becomes "exciting" if you have used the customer number as the link. But, used a GUID as a link, then one can just change the number and all is well.
The big attraction of GUID over sequence is that GUID is unique to the record across all tables. Sequence is not. To be sure, there is no sensible link in which one would confuse the value across tables, but using GUID it is only going to be a link if it is a valid link.
tamhas wrote:
Well, yes, but the minute you want the name or default sign or anything else out of the account, then you have to do the join anyway. The only time you can avoid the join is when the simple account number is all you want and you need no more information.
I mentioned two issues. For referential data, yes a join is necessary - and I have no problems with that. The problem I do have is with sorting.
tamhas wrote:
One of the big attractions of *not* using functional keys is that it is much more forgiving about mistakes. Assign the wrong customer number, create a few dozen transactions, and then fixing your mistake becomes "exciting" if you have used the customer number as the link. But, used a GUID as a link, then one can just change the number and all is well.
True, but once again when sorting it gets messy. It becomes a balancing act between how often do you need to report and how often do you need to fix mistakes.
For referntial links I am completely fine and can admire the virtue of a technical key. I am however very worried about the performance impact on sorting and selection. In the one key sort you can take your referential data to the outside part of the query and inner join to your transaction table, but sort on two or more referential technical keys and you are stuck with having to scan through all your transactions to get transactions on debtors between 1000 and 4000 and g/l accounts between 8000 and 8070 (for example).
If this concern can be taken away - I would be happy to hear it!
Three of our corporate products, which are SQL Server only, do all this guid joining. But it's SQL Server. It's a result set based database, not a record based database.
Sure, we need complex recoding logic. But recoding is the exception.
Sorting and selection are partly a question of how one roles it...
For starters, one might wonder in what context is a range of account numbers a genuinely interesting thing.
And, of course, there is the direction of the join. If one wants all transactions within a certain account number range, as one might doing the general ledger, for example, one *could* start with the transactions and that would work efficiently only if the account number itself was in the transaction record. But, in a general ledger, one *is* going to want to display things like the account name normal CR/DR etc. so one is going to need the join anyway. Knowing that one needs the join, one can select based on the account table and join efficiently to the transaction table on the GUID.
tamhas wrote:
Sorting and selection are partly a question of how one roles it...
For starters, one might wonder in what context is a range of account numbers a genuinely interesting thing.
Up till know our customers think multiple selections are genuinely interesting, so I do not need to wonder about this. Please show me all g/l transactions on g/l accounts x thru y, project a thru c, cost center e and f, cost categories a and b.
tamhas wrote:
And, of course, there is the direction of the join. If one wants all transactions within a certain account number range, as one might doing the general ledger, for example, one *could* start with the transactions and that would work efficiently only if the account number itself was in the transaction record. But, in a general ledger, one *is* going to want to display things like the account name normal CR/DR etc. so one is going to need the join anyway. Knowing that one needs the join, one can select based on the account table and join efficiently to the transaction table on the GUID.
Which is fine for one selection / sorting criterium as I mentioned in my previous post, but fails miserably for multiple selection / sorting criteria. There is a big difference in joining to simply get additional referential data and sorting on primary data.
Given the context of reporting such as you describe, it is hard for me to imagine how one avoids the join. If one has the join, it is a problem in query optimization.
I find the possibility to assign the PUK on the client (java/.net/progress) a big plus too. But I would be interested in some performanceconsiderations, did anyone of you do some tests?
FWIW: We talked about a GUID as part of the Progress Exchange 2014 "InfoExchange" session. Possible outcome could be a Guid datatype (16 byte binary) to optimize storage. Currently we store a guid in a character string which would be 32bytes (with the dashes removed). There was talk about sequential GUIDs also to optimize using as a database key field.
I'm going to create a Progress "Idea" Enhancement so feel free to vote/comment on that thread.
The time I had a look I found the sequential GUIDs most interesting. For those new to the matter: www.codeproject.com/.../GUIDs-as-fast-primary-keys-under-multiple-database
Mike,
I leave it to others to see your attempt as serious or not (in case not they might not dare to say so). I for myself think you're far too smart to ask this seriously. You try to test my knowledge I think. That's ok, but I think what I can answer (clustered indexes do not exist in the openedge db, some cut and paste comments of others like "Is the slower creation a concern? Personally, I find it acceptable. Unless your application involves very frequent inserts (in which case a GUID key may not be ideal for other reasons), the cost of occasional GUID creation will pale in comparison to the benefits of faster database operations." would not add much to the discussions that have taken place already on the peg and at exchange (I assume). Besides I'm not a db guru. You better discuss this with f.e. Gus and Tom.
According to what is written in the link I sent: www.codeproject.com/.../GUIDs-as-fast-primary-keys-under-multiple-database you can expect the comb guid creation to be a bit slower than the plain guid as we have it now. The clustered indexes would be needed for the real benefits: faster reads. I'll copy a discussion from the peg, you can find more info there. And maybe some who involved in the discussion at exchange want to explain more.
Re: clustered indexesImportant mainly because of your interaction with messages in the conversation.
Click to teach Gmail this conversation is not important.
pegx
Tom BascomOn 6/4/14, 2:39 AM, Stefan Houtzager wrote: > > Has the possibity to mark an ...
Jun 4
On 6/4/14, 2:39 AM, Stefan Houtzager wrote: > > Has the possibity to mark an ...
Tom Bascom
Jun 4
to peg
On 6/4/14, 2:39 AM, Stefan Houtzager wrote:
Has the possibity to mark an index as clustered in a progress db been
considered bij psc by the way?
A clustered index is one where the data is arranged in index order on disk. (The name strikes me as misleading -- "clustered data" seems more appropriate...)
There can only be one such index per table because the data can only be in one "physical" order.
When reading data in the clustered index order you can mostly skip reading the index -- so it is much faster (sort of like the oe11 TABLE-SCAN option except that the data is ordered).
But when writing it can be *very* expensive -- if you insert a record in the middle everything above it has to be shifted.
Personally I don't see the feature as being very attractive. I suspect that it gets a lot of attention because in the Microsoft world they get created automatically and they are a way to do fast table scans. So it is "received wisdom" that clustered indexes are an important feature. Somewhat like the idea that fields should have a known width that is never exceeded ;)
The ordering part is, maybe, occasionally useful but the write-performance trade-off seems far too high to me. I also suspect that a great many table scans either don't care about the ordering or will need a sort order different from such a key. My understanding is that with MS SQL these are often found in association with the "primary" key which in most Progress databases that I have seen is probably not the order that you are likely to want data in. Lastly -- if a GUID is used for the key to cluster around it would likely be an even less useful sort order. Even if the GUID were to be a monotonically increasing sort order the replication/merge use-case, which IMHO is the strongest reason to consider GUIDs as keys, makes the order attribute basically useless.
One Progress-specific issue that immediately springs to mind when thinking about Progress is that RECIDs would become volatile even within a session (not just across dump & load etc). You could read a record, someone could insert another record in the table, and your RECID would no longer point to what it used to point to.
--
Tom Bascom
Reply Reply to all Forward
Rob FitzpatrickFollowing that train of thought re volatile recids, if an insert of one recor...
Jun 4
Following that train of thought re volatile recids, if an insert of one recor...
Rob FitzpatrickJun 4
Loading...
Rob Fitzpatrick
Jun 4
to peg
Following that train of thought re volatile recids, if an insert of one
record caused a change in the recids of all the following records (in
clustered index order) then wouldn't any index entries for those records
in other indexes on that table also require the updated recids? If so I
wonder how SQL Server deals with that. Or are their data structures and
physical storage layout so different from Progress' that the question
isn't meaningful?
-----Original Message-----
From: Tom Bascom
One Progress-specific issue that immediately springs to mind when
thinking about Progress is that RECIDs would become volatile even within
a session (not just across dump & load etc). You could read a record,
someone could insert another record in the table, and your RECID would
no longer point to what it used to point to.
Reply Reply to all Forward
Gus Bjorklundwhat makes you guys think that everything else moves when you add a row to a ...
Jun 4
what makes you guys think that everything else moves when you add a row to a ...
Gus BjorklundJun 4
Loading...
Gus Bjorklund
Jun 4
to peg
On 6/4/14 9:17 AM, "Rob Fitzpatrick" wrote:
>Following that train of thought re volatile recids, if an insert of one
>record caused a change in the recids of all the following records (in
>clustered index order) then wouldn't any index entries for those records
>in other indexes on that table also require the updated recids? If so I
>wonder how SQL Server deals with that. Or are their data structures and
>physical storage layout so different from Progress' that the question
>isn't meaningful?
>
what makes you guys think that everything else moves when you add a row to
a table that has a clustered index ?
-gus
Reply Reply to all Forward
12 older messagesTim KuehnFrom the definition - "data is in index order on disk"
Jun 4
From the definition - "data is in index order on disk"
Tim KuehnJun 4
Loading...
Tim Kuehn
Jun 4
to peg
On Wed, Jun 4, 2014 at 9:45 AM, Gus Bjorklund wrote:
> what makes you guys think that everything else moves when you add a row to
> a table that has a clustered index ?
From the definition - "data is in index order on disk"
--
Tim Kuehn: Senior Consultant - TDK Consulting Services
Ontario PUG President
PUG Challenge Americas Executive, Program Committee Chair
Reply Reply to all Forward
Rob FitzpatrickGus, I don't know how clustered indexes work. That's why I wrote my question ...
Jun 4
Gus, I don't know how clustered indexes work. That's why I wrote my question ...
Rob FitzpatrickJun 4
Loading...
Rob Fitzpatrick
Jun 4
to peg
Gus,
I don't know how clustered indexes work. That's why I wrote my question
in conditional form. If the condition is false then my question is moot
and can be ignored.
-----Original Message-----
what makes you guys think that everything else moves when you add a row
to a table that has a clustered index ?
-gus
Reply Reply to all Forward
DavidMSSQL will leave some gaps, especially when using something like a guid. I gu...
Jun 4
MSSQL will leave some gaps, especially when using something like a guid. I gu...
DavidJun 4
Loading...
DavidMSSQL will leave some gaps, especially when using something like a guid. I gu...
Jun 4
MSSQL will leave some gaps, especially when using something like a guid. I gu...
David
Jun 4
to peg
On 6/4/2014 8:51 AM, Rob Fitzpatrick wrote:
Gus,
I don't know how clustered indexes work. That's why I wrote my question
in conditional form. If the condition is false then my question is moot
and can be ignored.
-----Original Message-----
what makes you guys think that everything else moves when you add a row
to a table that has a clustered index ?
-gus
MSSQL will leave some gaps, especially when using something like a guid. I guess its similar to create/toss limits in Progress. Guids are a poor choice for a clustered key since they are effectively random. Also, the clustered key becomes the implied last component of all other indexes. The ideal clustered key will be a monotonically increasing integer (eg. an identity column or a sequence assigned column)
Reply Reply to all Forward
Tom BascomBecause that's what the Wiki that I found for "clustered indexes" says happen...
Jun 4
Because that's what the Wiki that I found for "clustered indexes" says happen...
Tom BascomJun 4
Loading...
Tom Bascom
Jun 4
to peg
On 6/4/14, 10:12 AM, David wrote:
On 6/4/2014 8:51 AM, Rob Fitzpatrick wrote:
Gus,
I don't know how clustered indexes work. That's why I wrote my question
in conditional form. If the condition is false then my question is moot
and can be ignored.
-----Original Message-----
what makes you guys think that everything else moves when you add a row
to a table that has a clustered index ?
Because that's what the Wiki that I found for "clustered indexes" says happens. Various other material says the same thing.
It passes the sniff test because of the "stored in order on disk" characteristic.
I suppose it would not be a problem if records are always appended but I did say "insert a record in the middle".
-gus
MSSQL will leave some gaps, especially when using something like a guid. I guess its similar to create/toss limits in Progress. Guids are a poor choice for a clustered key since they are effectively random. Also, the clustered key becomes the implied last component of all other indexes. The ideal clustered key will be a monotonically increasing integer (eg. an identity column or a sequence assigned column)
So maybe it doesn't *always* cause everything to move but eventually gaps will get filled and when they do it is going to be really painful.
Sounds like a bad idea to me.
--
Tom Bascom
Reply Reply to all Forward
DavidLogically, the rows are ordered, but not necessarily physically. That is why ...
Jun 4
Logically, the rows are ordered, but not necessarily physically. That is why ...
DavidJun 4
Loading...
David
Jun 4
to peg
Logically, the rows are ordered, but not necessarily physically. That is why there is a need to periodically do a rebuild/reorg of the tables in MSSQL.
On 6/4/2014 9:44 AM, Tom Bascom wrote:
On 6/4/14, 10:12 AM, David wrote:
On 6/4/2014 8:51 AM, Rob Fitzpatrick wrote:
Gus,
I don't know how clustered indexes work. That's why I wrote my question
in conditional form. If the condition is false then my question is moot
and can be ignored.
-----Original Message-----
what makes you guys think that everything else moves when you add a row
to a table that has a clustered index ?
Because that's what the Wiki that I found for "clustered indexes" says happens. Various other material says the same thing.
It passes the sniff test because of the "stored in order on disk" characteristic.
I suppose it would not be a problem if records are always appended but I did say "insert a record in the middle".
-gus
MSSQL will leave some gaps, especially when using something like a guid. I guess its similar to create/toss limits in Progress. Guids are a poor choice for a clustered key since they are effectively random. Also, the clustered key becomes the implied last component of all other indexes. The ideal clustered key will be a monotonically increasing integer (eg. an identity column or a sequence assigned column)
So maybe it doesn't *always* cause everything to move but eventually gaps will get filled and when they do it is going to be really painful.
Sounds like a bad idea to me.
Reply Reply to all Forward
Thomas Mercer-Hursh, Ph.D.Is it, perhaps, the case that clustered index is used only on table which are...
Jun 4
Is it, perhaps, the case that clustered index is used only on table which are...
Tom BascomSorry, you lost me. I would have thought that the index is the "logical" part...
Jun 4
Sorry, you lost me. I would have thought that the index is the "logical" part...
Tom BascomJun 4
Loading...
Tom Bascom
Jun 4
to peg
On 6/4/14, 10:59 AM, David wrote:
Logically, the rows are ordered, but not necessarily physically. That is why there is a need to periodically do a rebuild/reorg of the tables in MSSQL.
Sorry, you lost me. I would have thought that the index is the "logical" part of things and that the ordering in the file is the "on disk" or "physical" portion.
Can you elaborate?
--
Tom Bascom
Reply Reply to all Forward
DavidI can't recall the specifics, but there is some kind of pointer left where th...
Jun 4
I can't recall the specifics, but there is some kind of pointer left where th...
DavidJun 4
Loading...
David
Jun 4
to peg
On 6/4/2014 10:07 AM, Tom Bascom wrote:
On 6/4/14, 10:59 AM, David wrote:
Logically, the rows are ordered, but not necessarily physically. That is why there is a need to periodically do a rebuild/reorg of the tables in MSSQL.
Sorry, you lost me. I would have thought that the index is the "logical" part of things and that the ordering in the file is the "on disk" or "physical" portion.
Can you elaborate?
I can't recall the specifics, but there is some kind of pointer left where the new middle row would logically be. Like a "jump to this block".
Reply Reply to all Forward
Gus Bjorklundthere are a variety of ways to implement storage for clustered index tables. ...
Jun 4
there are a variety of ways to implement storage for clustered index tables. ...
Gus BjorklundJun 4
Loading...
Gus Bjorklund
Jun 4
to peg
On 6/4/14 11:07 AM, "Tom Bascom" wrote:
>On 6/4/14, 10:59 AM, David wrote:
>> Logically, the rows are ordered, but not necessarily physically. That
>> is why there is a need to periodically do a rebuild/reorg of the
>> tables in MSSQL.
>
>Sorry, you lost me. I would have thought that the index is the
>"logical" part of things and that the ordering in the file is the "on
>disk" or "physical" portion.
>
>Can you elaborate?
>
>
there are a variety of ways to implement storage for clustered index
tables. and for handling inserts and page splits. you don't need to
reorganize the whole table for every insert. that would negate whatever
advantage you might get from the clustering.
note that "physical order" does not have much meaning with modern disk and
disk arrays. nobody stores things in disk sector order anymore. clustered
indexes serve to reduce the I/O overhead but don't have to be perfectly
optimized. good enough is good enough.
regards, gus
"Culture eats strategy for breakfast." -- Peter Drucker
Reply Reply to all Forward
Tom BascomThat would seem to defeat the whole "in order on disk" assumption that the be...
Jun 4
That would seem to defeat the whole "in order on disk" assumption that the be...
Tom BascomJun 4
Loading...
Tom Bascom
Jun 4
to peg
On 6/4/14, 11:13 AM, David wrote:
On 6/4/2014 10:07 AM, Tom Bascom wrote:
On 6/4/14, 10:59 AM, David wrote:
Logically, the rows are ordered, but not necessarily physically. That is why there is a need to periodically do a rebuild/reorg of the tables in MSSQL.
Sorry, you lost me. I would have thought that the index is the "logical" part of things and that the ordering in the file is the "on disk" or "physical" portion.
Can you elaborate?
I can't recall the specifics, but there is some kind of pointer left where the new middle row would logically be. Like a "jump to this block".
That would seem to defeat the whole "in order on disk" assumption that the benefits of clustered indexes are supposedly based on.
I guess that if it is a rare event that it might not be too horrible?
--
Tom Bascom
Reply Reply to all Forward
DavidYou have some control over how rare. If you have an ascending numeric key, yo...
Jun 4
You have some control over how rare. If you have an ascending numeric key, yo...
DavidJun 4
Loading...
David
Jun 4
to peg
On 6/4/2014 10:20 AM, Tom Bascom wrote:
On 6/4/14, 11:13 AM, David wrote:
On 6/4/2014 10:07 AM, Tom Bascom wrote:
On 6/4/14, 10:59 AM, David wrote:
Logically, the rows are ordered, but not necessarily physically. That is why there is a need to periodically do a rebuild/reorg of the tables in MSSQL.
Sorry, you lost me. I would have thought that the index is the "logical" part of things and that the ordering in the file is the "on disk" or "physical" portion.
Can you elaborate?
I can't recall the specifics, but there is some kind of pointer left where the new middle row would logically be. Like a "jump to this block".
That would seem to defeat the whole "in order on disk" assumption that the benefits of clustered indexes are supposedly based on.
I guess that if it is a rare event that it might not be too horrible?
You have some control over how rare. If you have an ascending numeric key, you can pack rows pretty tightly. If you have something like a GUID, then you'd set the "fill %" accordingly when you rebuild the index. This underscores why an plain GUID for a clustered key is not a great idea, since you have to have your fill % lower than other alternatives. The article mentioned earlier in the thread describes a way around this by manipulating the GUID to be ordered of sorts.
Reply Reply to all Forward
Stefan Houtzager
Jun 5
to David, peg
Seems like important content in the article I sent (
www.codeproject.com/.../GUIDs-as-fast-primary-keys-under-multiple-database)
is forgotten by some in this thread. "This article outlines an
approach for using GUID values as primary keys/clustered indexes that
avoids most of the normal disadvantages, adapting the COMB model for
sequential GUIDs".
So it is not about the normal guid with a clustered index.
On Wed, Jun 4, 2014 at 5:44 PM, David wrote:
> On 6/4/2014 10:20 AM, Tom Bascom wrote:
>>
>> On 6/4/14, 11:13 AM, David wrote:
>>>
>>> On 6/4/2014 10:07 AM, Tom Bascom wrote:
>>>>
>>>> On 6/4/14, 10:59 AM, David wrote:
>>>>>
>>>>> Logically, the rows are ordered, but not necessarily physically. That
>>>>> is why there is a need to periodically do a rebuild/reorg of the tables in
>>>>> MSSQL.
>>>>
>>>>
>>>> Sorry, you lost me. I would have thought that the index is the
>>>> "logical" part of things and that the ordering in the file is the "on disk"
>>>> or "physical" portion.
>>>>
>>>> Can you elaborate?
>>>>
>>> I can't recall the specifics, but there is some kind of pointer left
>>> where the new middle row would logically be. Like a "jump to this block".
>>
>>
>> That would seem to defeat the whole "in order on disk" assumption that the
>> benefits of clustered indexes are supposedly based on.
>>
>> I guess that if it is a rare event that it might not be too horrible?
>>
> You have some control over how rare. If you have an ascending numeric key,
> you can pack rows pretty tightly. If you have something like a GUID, then
> you'd set the "fill %" accordingly when you rebuild the index. This
> underscores why an plain GUID for a clustered key is not a great idea, since
> you have to have your fill % lower than other alternatives. The article
> mentioned earlier in the thread describes a way around this by manipulating
> the GUID to be ordered of sorts.
>
--
Kind regards,
Stefan Houtzager
Houtzager ICT consultancy & development
www.linkedin.com/in/stefanhoutzager
Reply Reply to all Forward
Tom Bascom
Jun 5
to me, peg
We didn't forget that. It has been mentioned several times. I changed the subject text though because topic of clustered indexes is interesting on its own and comes up once in a while and I wanted to talk about it more generally - not necessarily in relation to guids.
--
Tom Bascom
> On Jun 5, 2014, at 3:05 AM, Stefan Houtzager wrote:
>
> Seems like important content in the article I sent (
> www.codeproject.com/.../GUIDs-as-fast-primary-keys-under-multiple-database)
> is forgotten by some in this thread. "This article outlines an
> approach for using GUID values as primary keys/clustered indexes that
> avoids most of the normal disadvantages, adapting the COMB model for
> sequential GUIDs".
> So it is not about the normal guid with a clustered index.
>
>
>> On Wed, Jun 4, 2014 at 5:44 PM, David wrote:
>>> On 6/4/2014 10:20 AM, Tom Bascom wrote:
>>>
>>>> On 6/4/14, 11:13 AM, David wrote:
>>>>
>>>>> On 6/4/2014 10:07 AM, Tom Bascom wrote:
>>>>>
>>>>>> On 6/4/14, 10:59 AM, David wrote:
>>>>>>
>>>>>> Logically, the rows are ordered, but not necessarily physically. That
>>>>>> is why there is a need to periodically do a rebuild/reorg of the tables in
>>>>>> MSSQL.
>>>>>
>>>>>
>>>>> Sorry, you lost me. I would have thought that the index is the
>>>>> "logical" part of things and that the ordering in the file is the "on disk"
>>>>> or "physical" portion.
>>>>>
>>>>> Can you elaborate?
>>>> I can't recall the specifics, but there is some kind of pointer left
>>>> where the new middle row would logically be. Like a "jump to this block".
>>>
>>>
>>> That would seem to defeat the whole "in order on disk" assumption that the
>>> benefits of clustered indexes are supposedly based on.
>>>
>>> I guess that if it is a rare event that it might not be too horrible?
>> You have some control over how rare. If you have an ascending numeric key,
>> you can pack rows pretty tightly. If you have something like a GUID, then
>> you'd set the "fill %" accordingly when you rebuild the index. This
>> underscores why an plain GUID for a clustered key is not a great idea, since
>> you have to have your fill % lower than other alternatives. The article
>> mentioned earlier in the thread describes a way around this by manipulating
>> the GUID to be ordered of sorts.
>
>
>
> --
> Kind regards,
>
> Stefan Houtzager
>
> Houtzager ICT consultancy & development
>
> www.linkedin.com/in/stefanhoutzager
>
>
Reply Reply to all Forward
Stefan Houtzager
Jun 5
to Tom, peg
Ah, dummy me. David wrote "The article mentioned earlier in the thread
describes a way around this by manipulating the GUID to be ordered of
sorts." indeed.
On Thu, Jun 5, 2014 at 12:53 PM, Tom Bascom wrote:
> We didn't forget that. It has been mentioned several times. I changed the subject text though because topic of clustered indexes is interesting on its own and comes up once in a while and I wanted to talk about it more generally - not necessarily in relation to guids.
>
> --
> Tom Bascom
>
>> On Jun 5, 2014, at 3:05 AM, Stefan Houtzager wrote:
>>
>> Seems like important content in the article I sent (
>> www.codeproject.com/.../GUIDs-as-fast-primary-keys-under-multiple-database)
>> is forgotten by some in this thread. "This article outlines an
>> approach for using GUID values as primary keys/clustered indexes that
>> avoids most of the normal disadvantages, adapting the COMB model for
>> sequential GUIDs".
>> So it is not about the normal guid with a clustered index.
>>
>>
>>> On Wed, Jun 4, 2014 at 5:44 PM, David wrote:
>>>> On 6/4/2014 10:20 AM, Tom Bascom wrote:
>>>>
>>>>> On 6/4/14, 11:13 AM, David wrote:
>>>>>
>>>>>> On 6/4/2014 10:07 AM, Tom Bascom wrote:
>>>>>>
>>>>>>> On 6/4/14, 10:59 AM, David wrote:
>>>>>>>
>>>>>>> Logically, the rows are ordered, but not necessarily physically. That
>>>>>>> is why there is a need to periodically do a rebuild/reorg of the tables in
>>>>>>> MSSQL.
>>>>>>
>>>>>>
>>>>>> Sorry, you lost me. I would have thought that the index is the
>>>>>> "logical" part of things and that the ordering in the file is the "on disk"
>>>>>> or "physical" portion.
>>>>>>
>>>>>> Can you elaborate?
>>>>> I can't recall the specifics, but there is some kind of pointer left
>>>>> where the new middle row would logically be. Like a "jump to this block".
>>>>
>>>>
>>>> That would seem to defeat the whole "in order on disk" assumption that the
>>>> benefits of clustered indexes are supposedly based on.
>>>>
>>>> I guess that if it is a rare event that it might not be too horrible?
>>> You have some control over how rare. If you have an ascending numeric key,
>>> you can pack rows pretty tightly. If you have something like a GUID, then
>>> you'd set the "fill %" accordingly when you rebuild the index. This
>>> underscores why an plain GUID for a clustered key is not a great idea, since
>>> you have to have your fill % lower than other alternatives. The article
>>> mentioned earlier in the thread describes a way around this by manipulating
>>> the GUID to be ordered of sorts.
>>
>>
>>
>> --
>> Kind regards,
>>
>> Stefan Houtzager
>>
>> Houtzager ICT consultancy & development
>>
>> www.linkedin.com/in/stefanhoutzager
>>
>>
--
Kind regards,
Stefan Houtzager
Houtzager ICT consultancy & development
www.linkedin.com/in/stefanhoutzager
Reply Reply to all Forward
Marian Edu
Jun 5
to me, peg
have found this video to be quite helpful on explaining the differences - www.youtube.com/watch
main idea, for a clustered index all record data is available (leaf index structure = table structure) while for non-clustered if values from non-index columns are needed the rowid will be used to get the record from table structure... however I don't think this should require reshuffling whole records when adding an 'out of order' record, meaning that might be one implementation but it's definitively a bad one :)
didn't actually get the idea of the article though, depending on how rows gets stored in the heap using a sorted or random value for the field of clustered index (pk defaults or is mandatory clustered) the time required to insert a record might be higher but that is probably because of the algorithm the engine uses to find the right table structure to put it in and not because it has to rearrange all existing records in some cases.
and beside, imho the performance impact of using guid (string) as pk is more on the read end than on insert/create one
On 06/05/2014 02:09 PM, Stefan Houtzager wrote:
Ah, dummy me. David wrote "The article mentioned earlier in the thread
describes a way around this by manipulating the GUID to be ordered of
sorts." indeed.
On Thu, Jun 5, 2014 at 12:53 PM, Tom Bascom wrote:
We didn't forget that. It has been mentioned several times. I changed the subject text though because topic of clustered indexes is interesting on its own and comes up once in a while and I wanted to talk about it more generally - not necessarily in relation to guids.
--
Tom Bascom
On Jun 5, 2014, at 3:05 AM, Stefan Houtzager wrote:
Seems like important content in the article I sent (
www.codeproject.com/.../GUIDs-as-fast-primary-keys-under-multiple-database)
is forgotten by some in this thread. "This article outlines an
approach for using GUID values as primary keys/clustered indexes that
avoids most of the normal disadvantages, adapting the COMB model for
sequential GUIDs".
So it is not about the normal guid with a clustered index.
On Wed, Jun 4, 2014 at 5:44 PM, David wrote:
On 6/4/2014 10:20 AM, Tom Bascom wrote:
On 6/4/14, 11:13 AM, David wrote:
On 6/4/2014 10:07 AM, Tom Bascom wrote:
On 6/4/14, 10:59 AM, David wrote:
Logically, the rows are ordered, but not necessarily physically. That
is why there is a need to periodically do a rebuild/reorg of the tables in
MSSQL.
Sorry, you lost me. I would have thought that the index is the
"logical" part of things and that the ordering in the file is the "on disk"
or "physical" portion.
Can you elaborate?
I can't recall the specifics, but there is some kind of pointer left
where the new middle row would logically be. Like a "jump to this block".
That would seem to defeat the whole "in order on disk" assumption that the
benefits of clustered indexes are supposedly based on.
I guess that if it is a rare event that it might not be too horrible?
You have some control over how rare. If you have an ascending numeric key,
you can pack rows pretty tightly. If you have something like a GUID, then
you'd set the "fill %" accordingly when you rebuild the index. This
underscores why an plain GUID for a clustered key is not a great idea, since
you have to have your fill % lower than other alternatives. The article
mentioned earlier in the thread describes a way around this by manipulating
the GUID to be ordered of sorts.
--
Kind regards,
Stefan Houtzager
Houtzager ICT consultancy & development
www.linkedin.com/in/stefanhoutzager
--
m.edu
keep it simple
Reply Reply to all Forward
Stefan Houtzager
Jun 5
to Marian, peg
> the performance impact of using guid (string) as pk is more on the read end than on insert/create one
And that's where the 'comb guid' with the clustered index comes in
(see the article).
On Thu, Jun 5, 2014 at 2:38 PM, Marian Edu wrote:
> have found this video to be quite helpful on explaining the differences -
>
> main idea, for a clustered index all record data is available (leaf index
> structure = table structure) while for non-clustered if values from
> non-index columns are needed the rowid will be used to get the record from
> table structure... however I don't think this should require reshuffling
> whole records when adding an 'out of order' record, meaning that might be
> one implementation but it's definitively a bad one :)
>
> didn't actually get the idea of the article though, depending on how rows
> gets stored in the heap using a sorted or random value for the field of
> clustered index (pk defaults or is mandatory clustered) the time required to
> insert a record might be higher but that is probably because of the
> algorithm the engine uses to find the right table structure to put it in and
> not because it has to rearrange all existing records in some cases.
>
> and beside, imho the performance impact of using guid (string) as pk is more
> on the read end than on insert/create one
>
>
> On 06/05/2014 02:09 PM, Stefan Houtzager wrote:
>>
>> Ah, dummy me. David wrote "The article mentioned earlier in the thread
>> describes a way around this by manipulating the GUID to be ordered of
>> sorts." indeed.
>>
>> On Thu, Jun 5, 2014 at 12:53 PM, Tom Bascom wrote:
>>>
>>> We didn't forget that. It has been mentioned several times. I changed
>>> the subject text though because topic of clustered indexes is interesting on
>>> its own and comes up once in a while and I wanted to talk about it more
>>> generally - not necessarily in relation to guids.
>>>
>>> --
>>> Tom Bascom
>>>
>>>> On Jun 5, 2014, at 3:05 AM, Stefan Houtzager
>>>> wrote:
>>>>
>>>> Seems like important content in the article I sent (
>>>>
>>>> www.codeproject.com/.../GUIDs-as-fast-primary-keys-under-multiple-database)
>>>> is forgotten by some in this thread. "This article outlines an
>>>> approach for using GUID values as primary keys/clustered indexes that
>>>> avoids most of the normal disadvantages, adapting the COMB model for
>>>> sequential GUIDs".
>>>> So it is not about the normal guid with a clustered index.
>>>>
>>>>
>>>>> On Wed, Jun 4, 2014 at 5:44 PM, David wrote:
>>>>>>
>>>>>> On 6/4/2014 10:20 AM, Tom Bascom wrote:
>>>>>>
>>>>>>> On 6/4/14, 11:13 AM, David wrote:
>>>>>>>
>>>>>>>> On 6/4/2014 10:07 AM, Tom Bascom wrote:
>>>>>>>>
>>>>>>>>> On 6/4/14, 10:59 AM, David wrote:
>>>>>>>>>
>>>>>>>>> Logically, the rows are ordered, but not necessarily physically.
>>>>>>>>> That
>>>>>>>>> is why there is a need to periodically do a rebuild/reorg of the
>>>>>>>>> tables in
>>>>>>>>> MSSQL.
>>>>>>>>
>>>>>>>>
>>>>>>>> Sorry, you lost me. I would have thought that the index is the
>>>>>>>> "logical" part of things and that the ordering in the file is the
>>>>>>>> "on disk"
>>>>>>>> or "physical" portion.
>>>>>>>>
>>>>>>>> Can you elaborate?
>>>>>>>
>>>>>>> I can't recall the specifics, but there is some kind of pointer left
>>>>>>> where the new middle row would logically be. Like a "jump to this
>>>>>>> block".
>>>>>>
>>>>>>
>>>>>> That would seem to defeat the whole "in order on disk" assumption that
>>>>>> the
>>>>>> benefits of clustered indexes are supposedly based on.
>>>>>>
>>>>>> I guess that if it is a rare event that it might not be too horrible?
>>>>>
>>>>> You have some control over how rare. If you have an ascending numeric
>>>>> key,
>>>>> you can pack rows pretty tightly. If you have something like a GUID,
>>>>> then
>>>>> you'd set the "fill %" accordingly when you rebuild the index. This
>>>>> underscores why an plain GUID for a clustered key is not a great idea,
>>>>> since
>>>>> you have to have your fill % lower than other alternatives. The
>>>>> article
>>>>> mentioned earlier in the thread describes a way around this by
>>>>> manipulating
>>>>> the GUID to be ordered of sorts.
>>>>
>>>>
>>>>
>>>> --
>>>> Kind regards,
>>>>
>>>> Stefan Houtzager
>>>>
>>>> Houtzager ICT consultancy & development
>>>>
>>>> www.linkedin.com/in/stefanhoutzager
>>>>
>>>>
>>
>>
>
>
> --
> m.edu
> keep it simple
> ro.linkedin.com/.../marianedu
--
Kind regards,
Stefan Houtzager
Houtzager ICT consultancy & development
www.linkedin.com/in/stefanhoutzager
Reply Reply to all Forward
Tom Bascom
Jun 5
to peg
Interesting. I don't think I followed everything.
I especially don't understand the point of apparently duplicating the data in both the index and in the table. I'm going to need to go back and watch it again.
On 6/5/14, 8:38 AM, Marian Edu wrote:
have found this video to be quite helpful on explaining the differences - www.youtube.com/watch
main idea, for a clustered index all record data is available (leaf index structure = table structure) while for non-clustered if values from non-index columns are needed the rowid will be used to get the record from table structure... however I don't think this should require reshuffling whole records when adding an 'out of order' record, meaning that might be one implementation but it's definitively a bad one :)
didn't actually get the idea of the article though, depending on how rows gets stored in the heap using a sorted or random value for the field of clustered index (pk defaults or is mandatory clustered) the time required to insert a record might be higher but that is probably because of the algorithm the engine uses to find the right table structure to put it in and not because it has to rearrange all existing records in some cases.
and beside, imho the performance impact of using guid (string) as pk is more on the read end than on insert/create one
On 06/05/2014 02:09 PM, Stefan Houtzager wrote:
Ah, dummy me. David wrote "The article mentioned earlier in the thread
describes a way around this by manipulating the GUID to be ordered of
sorts." indeed.
On Thu, Jun 5, 2014 at 12:53 PM, Tom Bascom wrote:
We didn't forget that. It has been mentioned several times. I changed the subject text though because topic of clustered indexes is interesting on its own and comes up once in a while and I wanted to talk about it more generally - not necessarily in relation to guids.
--
Tom Bascom
On Jun 5, 2014, at 3:05 AM, Stefan Houtzager wrote:
Seems like important content in the article I sent (
www.codeproject.com/.../GUIDs-as-fast-primary-keys-under-multiple-database)
is forgotten by some in this thread. "This article outlines an
approach for using GUID values as primary keys/clustered indexes that
avoids most of the normal disadvantages, adapting the COMB model for
sequential GUIDs".
So it is not about the normal guid with a clustered index.
On Wed, Jun 4, 2014 at 5:44 PM, David wrote:
On 6/4/2014 10:20 AM, Tom Bascom wrote:
On 6/4/14, 11:13 AM, David wrote:
On 6/4/2014 10:07 AM, Tom Bascom wrote:
On 6/4/14, 10:59 AM, David wrote:
Logically, the rows are ordered, but not necessarily physically. That
is why there is a need to periodically do a rebuild/reorg of the tables in
MSSQL.
Sorry, you lost me. I would have thought that the index is the
"logical" part of things and that the ordering in the file is the "on disk"
or "physical" portion.
Can you elaborate?
I can't recall the specifics, but there is some kind of pointer left
where the new middle row would logically be. Like a "jump to this block".
That would seem to defeat the whole "in order on disk" assumption that the
benefits of clustered indexes are supposedly based on.
I guess that if it is a rare event that it might not be too horrible?
You have some control over how rare. If you have an ascending numeric key,
you can pack rows pretty tightly. If you have something like a GUID, then
you'd set the "fill %" accordingly when you rebuild the index. This
underscores why an plain GUID for a clustered key is not a great idea, since
you have to have your fill % lower than other alternatives. The article
mentioned earlier in the thread describes a way around this by manipulating
the GUID to be ordered of sorts.
--
Kind regards,
Stefan Houtzager
Houtzager ICT consultancy & development
www.linkedin.com/in/stefanhoutzager
--
Tom Bascom
Reply Reply to all Forward
Gus Bjorklundmaybe. there are tradeoffs, always these d**n tradeoffs. when multiple users ...
Jun 5
maybe. there are tradeoffs, always these d**n tradeoffs. when multiple users ...
Gus BjorklundJun 5
Loading...
Gus Bjorklund
Jun 5
to peg
On 6/5/14 8:38 AM, "Marian Edu" wrote:
>and beside, imho the performance impact of using guid (string) as pk is
>more on the read end than on insert/create one
>
>
maybe. there are tradeoffs, always these d**n tradeoffs.
when multiple users are making inserts to the same index, if the key
values are monotonically increasing then all the inserts can end up in the
same leaf block of a b-tree. then there will be contention and for locks
on the index block. the transactions serialize and you lose concurrency.
regards, gus
"Culture eats strategy for breakfast." -- Peter Drucker
Reply Reply to all Forward
Marian Edutrue, but my point was in normal circumstances one will read that data more t...
Jun 5
true, but my point was in normal circumstances one will read that data more t...
Marian EduJun 5
Loading...
Marian Edu
Jun 5
to peg
On 06/05/2014 05:29 PM, Gus Bjorklund wrote:
On 6/5/14 8:38 AM, "Marian Edu" wrote:
and beside, imho the performance impact of using guid (string) as pk is
more on the read end than on insert/create one
maybe. there are tradeoffs, always these d**n tradeoffs.
when multiple users are making inserts to the same index, if the key
values are monotonically increasing then all the inserts can end up in the
same leaf block of a b-tree. then there will be contention and for locks
on the index block. the transactions serialize and you lose concurrency.
true, but my point was in normal circumstances one will read that data more than once while there can only be one insert for the lifespan of an 'event'
and that assertion holds true for all indexes, back to those darn trade-offs I guess... how often one insert data in a table, how often data will be read and how will help a designer/dba to put those on balance and try to find the lesser of two evils :)
--
m.edu
keep it simple
mobile: +40 740 036 212
skype: marian.edu
Reply Reply to all Forward
Tom BascomSome data is write-only. (Most implementations of auditing spring to mind...)
Jun 5
Some data is write-only. (Most implementations of auditing spring to mind...)
Tom Bascom
Jun 5
to peg
true, but my point was in normal circumstances one will read that data more than once while there can only be one insert for the lifespan of an 'event'
Some data is write-only.
(Most implementations of auditing spring to mind...)
--
Tom Bascom
Reply Reply to all Forward
Marian Edu
Jun 6
to peg
On 06/05/2014 09:33 PM, Tom Bascom wrote:
true, but my point was in normal circumstances one will read that data more than once while there can only be one insert for the lifespan of an 'event'
Some data is write-only.
(Most implementations of auditing spring to mind...)
yeah, but in those cases one could argue there is no need for a primary-key... the most probably index on date (time-stamp) will lead to theracing condition Gus was mentioned on the b-tree leaf anyway since time flow isn't too random unfortunately, always increasing :)
but that makes me think... a good index must have a good distribution, does this mean a sequence should not be used as a primary key? An always incrementing number (like order number mentioned before) will lead to a skew index and not so great data distribution... well, luckily it's Friday :)
as for clustered index, wouldn't aordinary b-tree index on all table's fields will be the exact equivalent(not considering limitations)... this will of course duplicate the whole table data (table structure and b-tree index)
feel like those kind of thoughts would make a great camp fire subject at pug's/exchange events if kept at a level where developers can follow instead of flag it as `dba only` :)
- what makes a good index
- when enough is enough(don't over index)
- composed key vs. single key index (when, why,will a composed key be enough for a query if all fields required are there or will fetch the whole record from table anyway)
The implementation for OpenEdge is very relevant.
From the article:
[quote]
I could generate a million random GUIDs in 140 milliseconds, but sequential GUIDs took 2800 milliseconds
[/quote]
With 11.4 it takes me 498 milliseconds to generate only 10,000 guids.
def var ii as int no-undo. def var itime as int no-undo. itime = etime. do ii = 1 to 10000: guid. end. message etime - itime view-as alert-box.
You tested the C# solution from within the ABL where Mike asks for?
Seems slow on the surface, but under which circumstances will you generate guid in such a tight loop ?
I get around the same time.
Most pleasantly, if i use an integer counter, it's about the same speed - ~500 ms for both messages in the code below.
def var ii as int no-undo. def var cg as char no-undo. def var itime as int no-undo. itime = mtime. do ii = 1 to 10000: guid. end. message mtime - itime view-as alert-box. itime = mtime. do ii = 1 to 10000: cg = guid. entry(1, cg, '-') = string(ii, '99999999'). end. message mtime - itime view-as alert-box.
@stefan:
498 ms for 10,000.
this is a problem?
In that tight loop you could throw in a couple "if true then." and you would notice a slow down :)
[quote user="gus"]
@stefan:
498 ms for 10,000.
this is a problem?[/quote]
Just an observation.
A tight loop that does nothing one million times takes 225 ms.
A tight loop that generates one million guids takes 50001 ms.
In another language generating one million guids takes 140 ms.
This feels wrong.
[quote user="beginetienne"]
In that tight loop you could throw in a couple "if true then." and you would notice a slow down :)
[/quote]
Three 'if true then .' resulted in three ms extra.(in 1 million loop)
There are many such comparisons possible between ABL and some other language. String operations in particular come to mind. And yet, by and large ABL performs just fine compared to these other languages. In part, some of that comes from having more complex operations like reading and writing the disk built in and highly optimized and in part it comes from comparing things which don't happen or don't matter. E.g., what exactly is the context in which one is generating 10,000 or 1 million GUIDs? Chances are, if it ever happens it is in the context of creating new records in the database ... and so, the overhead of creating a GUID is a meaningless iota in the midst of the processing to obtain and validate the data which is going into the record and writing the record to disk.
[quote user="Stefan Drissen"]
Just an observation.
A tight loop that does nothing one million times takes 225 ms.
A tight loop that generates one million guids takes 50001 ms.
In another language generating one million guids takes 140 ms.
This feels wrong.
Just an observation, "My primary objection against using technical keys" (etc, by Stefan Drissen, 2012 in this thread). When then would what you observed become a problem in practice for you, is it a problem already? I for me would of course like to see the guid faster, you never know which usecase could appear sometime, but I have no problem with it's slowness for now.
And you Mike (or whoever), do you have a concrete problem with this slow guid?
Adding "if true then." slows down on my version of OE much more than 2 ms for 1M iterations. Increment a variable, or casting an integer to string is just as slow as guid().
i = i + 1 is "slow", in comparison to an iteration with no code.
I'd rather have a faster substring() function than guid().
This sounds a lot like another conversation on the PEG (or here don't remember) about class instantiation in ABL vs .NET.
E-
[quote user="agent_008_nl"]
Just an observation, "My primary objection against using technical keys" (etc, by Stefan Drissen, 2012 in this thread). When then would what you observed become a problem in practice for you, is it a problem already? I for me would of course like to see the guid faster, you never know which usecase could appear sometime, but I have no problem with it's slowness for now.
[/quote]
I do not use guids in practice. The theory interested me briefly, being a clean cut case.
I would like the compiler to handle global defined preprocessors more efficiently - but that's another story.
[quote user="beginetienne"]
Adding "if true then." slows down on my version of OE much more than 2 ms for 1M iterations. Increment a variable, or casting an integer to string is just as slow as guid().
i = i + 1 is "slow", in comparison to an iteration with no code.
I'd rather have a faster substring() function than guid().
This sounds a lot like another conversation on the PEG (or here don't remember) about class instantiation in ABL vs .NET.
E-
[/quote]
I'm using prowin.exe on 11.4 x64 on Vista x64 on a five year old i7-920.
For one million iterations:
description | ms |
---|---|
empty loop | 221 |
if true then . | 388 |
today | 847 |
time | 747 |
string( ii ) | 737 |
string( ii, ">>>>>>>>9" ) | 1788 |
substring( "", ii, 1 ) | 588 |
it = it + 1 | 653 |
guid | 50000 |
This is not slower - it is crawling. But again - just an observation.
Right... so I pulled out 11.2.1 32-bit - 1 million guids in 472 ms....
Is the 11.2.1 guid generator broken and functionally fixed at the expense of performance in 11.4 x64?
Update: switched to a Win7 x64 machine:
10.2B08: 50000 ms
11.2.1: 378 ms
11.4 (32-bit): 49999 ms
11.4 (64-bit): 50005 ms
So the 11.2.1 GUID generator seems to have been broken.
Older AIX box, each for 1 million calls of GUID.
10.2A (32 bit) 667ms
10.2B07(32 bit) 621ms
11.3 (64 bit) 670ms
I suspect a Windows call issue of some kind.
beginetienneAdding "if true then." slows down on my version of OE much more than 2 ms for 1M iterations. Increment a variable, or casting an integer to string is just as slow as guid().i = i + 1 is "slow", in comparison to an iteration with no code.I'd rather have a faster substring() function than guid().This sounds a lot like another conversation on the PEG (or here don't remember) about class instantiation in ABL vs .NET.E-
I'm using prowin.exe on 11.4 x64 on Vista x64 on a five year old i7-920.
For one million iterations:
description
|
ms
|
empty loop
|
221
|
if true then .
|
388
|
today
|
847
|
time
|
747
|
string( ii )
|
737
|
string( ii, ">>>>>>>>9" )
|
1788
|
substring( "", ii, 1 )
|
588
|
it = it + 1
|
653
|
guid
|
50000
|
This is not slower - it is crawling. But again - just an observation.
Right... so I pulled out 11.2.1 32-bit - 1 million guids in 472 ms....
Is the 11.2.1 guid generator broken and functionally fixed at the expense of performance in 11.4 x64?
Flag this post as spam/abuse.
[quote user="Aidan Jeffery"]
[/quote]
Thanks for clearing that up! I was rather sure I'd seen this somewhere before, but couldn't find it in the release notes or in the knowledgebase.
[quote user="Stefan Drissen"]
For one million iterations:
description | ms |
---|---|
empty loop | 221 |
if true then . | 388 |
today | 847 |
time | 747 |
string( ii ) | 737 |
string( ii, ">>>>>>>>9" ) | 1788 |
substring( "", ii, 1 ) | 588 |
it = it + 1 | 653 |
guid | 50000 |
[/quote]
I have run the same test just out of curiosity.
OE 10.2B, hp Integrity BL860c i2, B.11.31 ia64
CPU: Intel® Itanium® 9300 series
Memory: 64gb
description | ms |
---|---|
empty loop | 390 |
3x if true then . | 1558 |
today | 3835 |
time | 3774 |
string( ii ) | 1931 |
string( ii, ">>>>>>>>9" ) | 3832 |
substring( "", ii, 1 ) | 1367 |
it = it + 1 (repeat) | 2561 |
guid | 1559 |
Etienne.