Before I head off and do all sorts of stats gathering, I was just wondering if anyone else had done something similar ..
given that I have something that happens on a date (be it financial or actions like "a memo"), and that accumulating totals by client, account, and date, would it be best to store the date
a) as a date with an index
b) as three fields yyyy mm dd with an index on yyyy, an index on mm , and an index on dd
there are millions of records per year, with 25 years+ data
now, if I wanted to get a total of all records in January 2000
a) for each table no-lock where client eq "foo" and date ge 1/1/2000 and date le 31/12/2000
b) for each table no-lock where client eq "foo" and yyyy eq 2000 and mm eq 1
which would be faster, or more efficient ?
* jmls goes off to play with 50m+ records ...
a) will be faster.
You might want to have a compound index on client and date. But I bet these are not the only queries you are going to perform, I bet the story has more to it.
if date were a datetime would that make a difference ?
And, of course, you are right. There are several queries that will be excecuted, some of them not involving the client ..
Well, you said "date" but I was actually thinking "datetime". Since the
datetime datatype combines date and time it will be faster than having
separate date and time fields.
However, if you don't have time in your existing data (and you would likely
use 00:00:00.000 for the time part), datetime will be marginally slower than
a date field.
-gus
On 12/19/11 2:07 PM, "progresscommunities"
gus wrote:
Well, you said "date" but I was actually thinking "datetime". Since the datetime datatype combines date and time it will be faster than having separate date and time fields.
This structure'll work if one exclusively does date-time comparisons, it's a bit harder to use when one wants to do date-specific comparisons / lookups. I've had situations where I need both/and, so I have to have both a date-time and a date field in a db table along with corresponding indexes.
Can you give an example for why you need both?
if I try to do something like this:
DEFINE TEMP-TABLE tt NO-UNDO
FIELD dt-field AS DATE
FIELD dtm-field AS DATETIME
INDEX i-dt dt-field
INDEX i-dtm dtm-field
.
FOR EACH tt
WHERE tt.dtm-field = TODAY
NO-LOCK:
DISPLAY tt
WITH FRAME f2
DOWN.
END.
I get:
+---------------------------- Error ----------------------------+
¦ ** Incompatible data types in expression or assignment. (223) ¦
¦ ** date-time.p Could not understand line 22. (196) ¦
¦ ¦
¦ ------------------------------------------------------------- ¦
¦ ¦
+---------------------------------------------------------------+
In order to do date (range) matching against a date-time field, I'd need to do something like:
WHERE tt.dtm-field >= DATETIME(cur-date, 0) AND tt.dtm-field
which is a pain in the patootie, particularly when automated query generators are involved.
Ah! I get it.
This is where you might want to do something like
WHERE DATE(tt.tdm-field) EQ TODAY
This syntax works today (in 11 at least) but I would not expect it to perform particularly well because of the LHS-function. But I would think that this would be a nice thing to have and it might be worthwhile reporting as a bug/enhancement.
-- peter
arrgh! please, please do not encourage this by mentioning it !!.
Progress should not compile this statement.
Mind you, people like Dan and Tom will be rubbing their hands with
glee in anticipation of the performance issues coming down the line
jmls wrote:
arrgh! please, please do not encourage this by mentioning it !!.
Progress should not compile this statement.
Mind you, people like Dan and Tom will be rubbing their hands with
glee in anticipation of the performance issues coming down the line
Should not compile this specific statement - ie the DATE - or functions-in-WHERE-clauses in general?
You can make an argument that DATE and DATETIME are wideing cases, and so should be treated similarly to CHAR / LONGCHAR and INT / INT64. I think in these cases there should be some flexibility without the performance hit. Adding functions (ABL or UDF) willy-nilly into a WHERE clause is not a good idea at all.
An alternative might be to extent the BEGINS statement to allow parts of datetime values on datetime fields:
WHERE tt.dt-field BEGINS TODAY
or
WHERE tt.dt-field BEGINS DATETIME(TODAY, 12, 00, 00).
The use of BEGINS semantically bight be a little odd, but I think you get my drift.
-- peter
lhs functions in where clauses. Or at least, get the compiler to throw
out a bloody great "Oi! do you want your query to run as slow as a
one-legged drunken kangaroo"
jmls wrote:
lhs functions in where clauses. Or at least, get the compiler to throw
out a bloody great "Oi! do you want your query to run as slow as a
one-legged drunken kangaroo"
Do you claim copyright over that message text, or can we use it as-is?
-- peter
Creative Commons. Use as you see fit .. .. Unless we get an
objection from our Antipodean friends ..
Don't forget to add a comment to any such code:
...
where LHS( somefield ) = something /* improved performance! */
...
That way I'll have no trouble finding it when it is time to fix it
pjudge wrote:
This is where you might want to do something like
WHERE DATE(tt.tdm-field) EQ TODAY
This syntax works today (in 11 at least) but I would not expect it to perform particularly well because of the LHS-function. But I would think that this would be a nice thing to have and it might be worthwhile reporting as a bug/enhancement.
-- peter
Absofrickinloutely NOT. I have no interest in doing things that may result in increased business for Tom!