Searching in a memptr

Posted by jmls on 27-Apr-2012 03:32

Assume I have a memptr containing a large amount of text data (approx 25 MB worth).

This text consists of several chunks, each delimited by chr(0).

How can I extract each chunk?

I can't

1) use get-string, as this is a character expression, so it craps out after 25k or so

2) use copy-lob to a longchar and then index(longchar,chr(0)), as copy-lob fails because of the embedded chr(0)

I can chunk it by get-string(memptr,start,min(25k,length - start)); however, this is further complicated by the fact that specifying the numbytes to read on get-string makes it ignore nulls, so I then have to chunk the chunk, which has performance issues.

So, a long-winded way of asking, but is there a way of searching for a byte in a memptr? Like an index() for memptr?

thanks

All Replies

Posted by gus on 27-Apr-2012 11:50

You can use the C runtime library function strlen() to find the null bytes and index() or memchr() to find other bytes.
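For reference, a minimal C sketch of what such a scan looks like on the C side, using memchr() to find each chr(0) delimiter. The helper name split_on_nul and its output arrays are made up for illustration; they are not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: records the start offset and length of each
   NUL-delimited chunk in buf[0..len). Returns the number of chunks found;
   at most max_chunks entries are written to starts[] and lengths[]. */
size_t split_on_nul(const unsigned char *buf, size_t len,
                    size_t *starts, size_t *lengths, size_t max_chunks)
{
    assert(buf != NULL || len == 0);
    size_t pos = 0, n = 0;
    while (pos < len) {
        /* Search only the bytes that remain after pos. */
        const unsigned char *hit = memchr(buf + pos, 0, len - pos);
        /* No delimiter left: the last chunk runs to the end of the buffer. */
        size_t end = hit ? (size_t)(hit - buf) : len;
        if (n < max_chunks) {
            starts[n] = pos;
            lengths[n] = end - pos;
        }
        n++;
        pos = end + 1;  /* step over the delimiter */
    }
    return n;
}
```

Each (start, length) pair can then be read out separately. Two adjacent delimiters produce a zero-length chunk in this sketch.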

Posted by jmls on 28-Apr-2012 03:06

can I pass a memptr straight into these dll calls ?

Mike: is there a .net call that I could use ?

Posted by Admin on 28-Apr-2012 04:07

Mike: is there a .net call that I could use ?

Eventually. But the MEMPTR resides in the AVM's memory space and is not directly accessible by the CLR. You'd have to duplicate it on the .NET side into a System.Byte[] to use .NET functions.

But AFAIK it's possible to pass a MEMPTR to an external procedure

Posted by jmls on 29-Apr-2012 04:00

ok, trying to use memchr, but confused

documentation is as follows


void *memchr(
   const void *buf,
   int c,
   size_t count
);

Parameters

buf
Pointer to buffer.
c
Character to look for.
count
Number of characters to check.

Return Value

If successful, returns a pointer to the first location of c in buf. Otherwise it returns NULL.

Remarks

Looks for the first occurrence of c in the first count bytes of buf. It stops when it finds c or when it has checked the first count bytes.


so, my definition is as follows:

procedure memchr external "msvcrt.dll" CDECL:

  def input parameter p_string as MEMPTR no-undo.

  def input parameter p_char as short no-undo.

  def input parameter p_len as short no-undo.

  def return parameter ReturnPointer as xxxx no-undo.

end procedure.

what should xxxx be ?

Posted by abe.voelker on 29-Apr-2012 09:52

Works on my 32-bit version: https://gist.github.com/2550928

Posted by jmls on 29-Apr-2012 13:11

great answer. Works well for me as well

Thanks very much.

I was nearly there, but didn't get how to work out the positions, forgot about get-pointer-value.
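The position arithmetic is "returned address minus base address". A small C sketch of the same idea, assuming a 1-based position is wanted (as with get-string); byte_position is a hypothetical name used only here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 1-based position of byte c in buf[0..len),
   or 0 if c does not occur. */
size_t byte_position(const void *buf, size_t len, int c)
{
    assert(buf != NULL);
    const char *base = buf;
    const char *hit = memchr(buf, c, len);
    /* hit - base is the 0-based offset; add 1 for a 1-based position. */
    return hit ? (size_t)(hit - base) + 1 : 0;
}
```

For example, byte_position("The quick", 9, 'q') gives 5, because 'q' sits at offset 4 from the start of the buffer.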

Posted by jmls on 29-Apr-2012 13:13

Oh, I meant to ask - should I set-size(ret_ptr) = 0 when I'm done ?

Posted by jmls on 29-Apr-2012 13:24

A third and final comment:

if mString were to be

"The quick brown dog jumps over the lazy fox"

and I wanted to start searching from position 4 (q) ...

I think that I need to use as handle, with the value passed in calculated by pointer-value

but, as usual, I'm stumped with this c stuff

[edit] now looking at set-pointer-value

Posted by abe.voelker on 29-Apr-2012 13:25

Nope, the return value of that shared library should just be a pointer to a location inside the data structure you passed in to it so no need to free it.  Actually, I think OpenEdge allows you to define the return parameter as a LONG (or INT64 if you're using a 64-bit OS) in this case and you can get the address directly, without having to do the GET-POINTER-VALUE afterwards.  But I did forget to free the mString I allocated myself in my example... whoops, lol.

Posted by Admin on 29-Apr-2012 13:27

Oh, I meant to ask - should I set-size(ret_ptr) = 0 when I'm done ?

Whom else are you expecting to do that for you?

Posted by abe.voelker on 29-Apr-2012 13:32

Yeah to do that you would have to move the pointer forward before calling the external C function. I can change my example to show you how to do that.  I don't think you would want to mutate your existing data structure though, as it might cause issues when you have to free it later.

Posted by jmls on 29-Apr-2012 13:37

[snip]

def var x as memptr.

set-pointer-value(x) = get-pointer-value(mString) + 4.

call memchr with x, not mString

works for me.

Thanks for the pointers (!) , they pushed me in the right direction.
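One thing to watch when moving the start pointer forward: the byte count passed to memchr has to shrink by the same amount, otherwise the search can run past the end of the buffer. A minimal C sketch of that, with find_from as a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: searches buf[start..len) for byte c.
   Returns a pointer into buf, or NULL if c is not found. */
const void *find_from(const void *buf, size_t len, size_t start, int c)
{
    assert(buf != NULL);
    if (start >= len)
        return NULL;  /* nothing left to search */
    /* Advance the pointer by start and reduce the count by start. */
    return memchr((const char *)buf + start, c, len - start);
}
```

The original buffer pointer is left untouched, so it can still be freed normally afterwards.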

Posted by jmls on 29-Apr-2012 13:38

I wasn't sure, because the return value is actually pointing to a location within an existing memptr, not a new memptr

Posted by abe.voelker on 29-Apr-2012 13:44

Yep, that should work.  Also, since you really only care about the addresses anyway you can change the function binding to just use LONG (or INT64 if you're on 64-bit) for pointer addresses and OpenEdge will just return the addresses.  I updated my example with that, and here is the function binding that should save you from having to call GET-POINTER-VALUE:

PROCEDURE memchr EXTERNAL "msvcrt.dll" CDECL:
  DEFINE INPUT PARAMETER str_ptr     AS LONG.
  DEFINE INPUT PARAMETER char_val    AS LONG.
  DEFINE INPUT PARAMETER check_bytes AS UNSIGNED-LONG.
  DEFINE RETURN PARAMETER ret_ptr     AS LONG.
END PROCEDURE.

Posted by abe.voelker on 29-Apr-2012 13:50

If you want to learn more C the absolute best reference is The C Programming Language (and it's not a very big book).  However, after you read it you might not want to write much ABL afterwards

Posted by jmls on 29-Apr-2012 14:01

Heh. let's not go down that road again

Posted by abe.voelker on 29-Apr-2012 14:04

Why not? Religious wars are fun!

Posted by Admin on 29-Apr-2012 14:07

Posted by jmls on 29-Apr-2012 14:20

not when it's a MAD scenario ...

Posted by jmls on 30-Apr-2012 02:00

ok, gotta bite :  *this* is why I find C very messy. This C code is from an open source project , and the bug truncated all sound files from position 0 ...

v1       if ((cur = ftello(fs->f) < 0))
v2       if ((cur = ftello(fs->f)) < 0)

kudos to the guys that found this one.

Posted by gus on 30-Apr-2012 07:47

can I pass a memptr straight into these dll calls ?

Sure. Both for Windoze dll's and for UNIX/Linux shared libraries.

Posted by gus on 30-Apr-2012 07:52

memptr

Posted by gus on 30-Apr-2012 08:24

You have to be very careful when calling out to shared libraries and dll's.

One common error is setting size of memptr wrong.

Another source of problems is that the sizes of things are not the same in 64-bit executables as in 32-bit.
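To make the size point concrete, here is a small C sketch that reports the widths that matter for a binding like memchr. The function names are invented for this example. Note that C's long is 32 bits on 64-bit Windows but 64 bits on 64-bit Linux, so it is not a safe stand-in for a pointer.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Width in bits of a data pointer in this executable. */
unsigned pointer_bits(void)
{
    /* Assumption for this sketch: size_t is as wide as a data pointer,
       which holds on mainstream 32-bit and 64-bit platforms. */
    assert(sizeof(size_t) == sizeof(void *));
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}

/* Width in bits of C's long, which varies between platforms. */
unsigned c_long_bits(void)
{
    return (unsigned)(sizeof(long) * CHAR_BIT);
}

/* Nonzero if an address does not fit in a 32-bit parameter. */
int pointer_needs_64_bit_param(void)
{
    return pointer_bits() > 32;
}
```

If pointer_needs_64_bit_param() is nonzero, a 32-bit parameter type for the buffer address or the return value would truncate it.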

Posted by abe.voelker on 30-Apr-2012 13:15

Is there a good way of determining at compile-time in OpenEdge whether the program is being compiled for 32-bit or 64-bit?

Posted by abe.voelker on 30-Apr-2012 13:24

True, C can be like a chainsaw without guards...  I guess it's the price you pay for getting close to the metal.  This is a pretty common mistake to make (conditional evaluation of assignment); it could be caught by enabling compiler warnings (in GCC, -Wall should do the trick).
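A stripped-down C sketch of this class of bug, for anyone following along. fake_tell is a stand-in for a real call such as ftello() and always returns 1234; all names here are made up for the illustration.

```c
#include <assert.h>

/* Stand-in for a call such as ftello(): returns a non-negative position. */
static long fake_tell(void) { return 1234; }

/* Buggy form: '<' binds tighter than '=', so cur receives the result
   of the comparison (0 or 1), not the position. */
long position_buggy(void)
{
    long cur;
    if ((cur = fake_tell() < 0))
        return -1;
    return cur;
}

/* Fixed form: the assignment is parenthesized, then compared. */
long position_fixed(void)
{
    long cur;
    if ((cur = fake_tell()) < 0)
        return -1;
    return cur;
}

/* Self-check showing the difference between the two forms. */
void self_check(void)
{
    assert(position_buggy() == 0);
    assert(position_fixed() == 1234);
}
```

The buggy form silently reports position 0, which is how a truncate-at-current-position call ends up truncating from the start.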

Posted by gus on 30-Apr-2012 14:55

What matters is that a 32-bit executable cannot call a 64-bit shared library and vice versa. So at runtime you have to call the right version of the library. Off the top of my head, I don't know how to figure out if the OpenEdge executable is 64 or 32 bit. Another issue to worry about is that some of the libraries may be in different places on different operating systems.

Posted by jmls on 03-May-2012 04:07

So, would it be wiser to create my own .c shared library to allow anyone to compile a 32/64 bit dll in linux / unix / hp / whateveros ?

Posted by gus on 03-May-2012 08:47

Probably not a good idea to make your own version of all or part of the C runtime library !

Maybe you should make wrapper procedures and functions .p's that can be tweaked and or compiled as needed.

There is no longer a need for 32-bit code on UNIX or Linux and has not been for many years. That said, I have heard rumours that there are people who are not aware of this and still use 32-bit stuff.

Who is the audience you are aiming for?

Posted by jmls on 03-May-2012 10:02

heh, wasn't looking at building my own runtime

Trying to find which so.dll contains the memchr function using google was impossible.

main(SomeParameters)

if memchr(mystring ) etc etc

seems a little simpler ..
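Roughly what I have in mind, as a sketch only: a thin wrapper around memchr that returns an offset instead of a pointer and uses fixed-width types, so the declaration on the calling side would not change between 32-bit and 64-bit builds. The name find_byte and its parameter layout are my own invention.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper: 0-based offset of byte c in buf[start..len),
   or -1 if c is not found or the arguments are out of range. */
int64_t find_byte(const void *buf, int64_t len, int64_t start, int32_t c)
{
    assert(buf != NULL);
    if (len < 0 || start < 0 || start >= len)
        return -1;
    const unsigned char *base = buf;
    const unsigned char *hit = memchr(base + start, c, (size_t)(len - start));
    /* Report an offset from the start of buf, never a raw address. */
    return hit ? (int64_t)(hit - base) : -1;
}
```

This does not reimplement anything from the C runtime; it only hides the pointer arithmetic behind one call.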

Posted by gus on 03-May-2012 10:32

The man pages tell you where they are. So for memchr() if you

man memchr

you get this:

MEMCHR(3) BSD Library Functions Manual MEMCHR(3)

NAME

memchr -- locate byte in byte string

LIBRARY

Standard C Library (libc, -lc)

SYNOPSIS

#include <string.h>

void *
memchr(const void *s, int c, size_t n);

...
