Back to the basics. Fundamental index question

Posted by OctavioOlguin on 10-Mar-2015 12:36

I'm doing a data warehouse database, I intend to load with massive amounts of data (for my current system, that would be 5 gb database at most ;)

I'm facing a decision I can't be certain that I have all the knowledge in one hand.

I have this table:

Packet:
   warehouse  int
   code            dec
   entryDate    dat
   exitDat        dat
   weight         dec     ...  etc

I need to answer (fast) some question like this:

1)  wich packets from warehouse 1 were checkeout  sometime between date1 and date2

2) idem for checking in.

3) give me all the packets that were in motion between date1 and date2 (query1: packets in  -and query2: packets out) from all warehouses or one in particular.

question 1, takes me to this index:  (warehouse, entrydate), wich I don't think that allows a easy bracketing for 3rd. question

q2: same as 1 for existdate.

question 3, considering these indexes:  (entrydate, warehouse) and (exitdate, warehouse)

So I wonder, do i have to make index(x,y) and index(y,x) on each field pair (exitdate, wh)  & (wh, exitdate) and the same for entrydate?

All Replies

Posted by S33 on 10-Mar-2015 13:26

Assuming that you have, or will make, a warehouse table, your indexes (warehouse, entrydate) and (warehouse, exitdate) could be used for q3, if you think of it as:

for each warehouse where <blah-blah> no-lock,

   each packet of warehouse where entrydate >= mydate1 and exitdate <= mydate2

Posted by OctavioOlguin on 10-Mar-2015 19:53

Yes, indeed... haven't tought of that......  good!...

Thanks,

Posted by cverbiest on 11-Mar-2015 05:33

I would replace the date fields with datetime

Posted by S33 on 11-Mar-2015 09:45

I'm also assuming the number of warehouses is small, relative to the number of packets, but that seems a realistic real-world assumption...

Posted by OctavioOlguin on 28-Mar-2015 10:39

Indeed, warehouses around 20 and packets by the thousands...

I 'm putting to work what I learned form you, but brings up one question , that involves compiler and avm runtime I guess..

DO iPos = 1 TO NUM-ENTRIES(pcSucursales):

   FOR EACH dwP1 NO-LOCK

       WHERE dwP1.Sucursal = integer(ENTRY(iPos, pcSucursales))

       AND   dwP1.Fecha >= pdFecha1

       AND   dwP1.Fecha <= pdFecha2:    

   END.

END.

The num-entries, is evaluated every loop?

the integer(ENTRY( is evaluated for every pack?

should I pass to a variable before the loops the values that make the conditions?

Thanks

Posted by Abri Venter on 30-Mar-2015 01:15

Hallo

If you wanted you could also use:

   DO iPos = NUM-ENTRIES(psSucursales) TO 1 BY -1

This would only evaluate the NUM-ENTRIES once.

This would only be possible if the order you retrieve data does not matter.

Thank you

A Venter

Posted by Peter Judge on 30-Mar-2015 08:19

I believe this evaluates the NUM-ENTRIES() function (or any function) on each iteration.
 
Check the below:
function counter returns integer(input pi as int):
    message 'in function' pi
view-as alert-box.
    return 5.
end.
 
def var i as int.
do i = 1 to counter(i):
end.
 
 
-- peter
 
[collapse]
From: Abri Venter [mailto:bounce-abrivaigscoza@community.progress.com]
Sent: Monday, 30 March, 2015 02:16
To: TU.OE.RDBMS@community.progress.com
Subject: RE: [Technical Users - OE RDBMS] Back to the basics. Fundamental index question1
 
Reply by Abri Venter

Hallo

If you wanted you could also use:

   DO iPos = NUM-ENTRIES(psSucursales) TO 1 BY -1

This would only evaluate the NUM-ENTRIES once.

This would only be possible if the order you retrieve data does not matter.

Thank you

A Venter

Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse]

Posted by Simon L. Prinsloo on 30-Mar-2015 08:59

Functions in the FOR block's criteria are evaluated only once before the first iteration.

Functions in Counters fire once for the start value and once more than the number of iterations for the limit.

DEFINE VAR iA AS INT NO-UNDO.
DEFINE VAR iB AS INT NO-UNDO.


FUNCTION FunctionA RETURNS INTEGER():
    iA = iA + 1.
    RETURN 5.
END.

FUNCTION FunctionB RETURNS INTEGER():
    iB = iB + 1.
    RETURN 3.
END.

DEF VAR i AS INTEGER.
DO i =  1 TO FunctionA():

END.
MESSAGE "Done with i =  1 TO FunctionA()" SKIP
        "Function fired" iA "times"         
    VIEW-AS ALERT-BOX.
iA = 0.
DO i = FunctionA() TO 1 BY -1:

END.
MESSAGE "Done with i = FunctionA() TO 1 BY -1:" SKIP
        "Function fired" iA "times"         
    VIEW-AS ALERT-BOX.

iA =  0.
   
FOR EACH Customer WHERE CustNum <= FunctionA()
    i = 1 TO FunctionB():
   DISP CustNum Name.
END.

MESSAGE "Done with FOR EACH" SKIP
        "Filter Function fired" iA "times" SKIP
        "Counter Function fired" iB "times"
    VIEW-AS ALERT-BOX.


Posted by S33 on 30-Mar-2015 10:55

That's a neat trick, Abri; I never knew that.

So for Octavio the answer is that num-entries fires on every iteration unless using Abri's "by -1" approach, and the integer(entry( function fires once at the start of the for each block (aka once per warehouse). So, given the small handful of warehouses, you're not wasting very many milliseconds with the code you have: just whatever time it takes to do 20 "extra" num-entries calls.

Here's a tougher question: If the value of iPos or pcSucursales could change in the body of the for each loop (bad programming to say the least), would the compiler know that and re-evaluate integer(entry re-iteratively?

I know you can retrieve a record more than once:

for each table where id > 1000 by id: id = id + 2000. end.

Jim Shepherd

This thread is closed