Map ASCII characters to valid HTML

Posted by martinz on 17-Aug-2016 03:37

Hi all,

I need to convert valid ASCII characters (like &, < and >) to valid HTML (&amp;, &lt; and &gt;). Is there an easy way to do this in ABL? Or does anyone want to share a mapping function?

We're on OE 11.5. Just to narrow down the options: it needs to work on Linux, so (afaik) no .NET.

- Martin

Posted by Marc Fellman on 17-Aug-2016 06:09

If you use the SAX or DOM parser to generate the XML, then this is not an issue. These parsers take care of the escaping.
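
To illustrate the suggestion above, here is a minimal sketch using ABL's SAX-WRITER object, which should escape markup characters in text content as it writes. The element name "p", the sample text and the variable names are made up for the example. Since the writer produces XML (by default with an XML declaration in front), this fits best when the output is XML or XHTML rather than a loose HTML fragment.

```abl
/* Sketch only: let SAX-WRITER do the escaping instead of REPLACE. */
DEFINE VARIABLE hWriter AS HANDLE   NO-UNDO.
DEFINE VARIABLE lcOut   AS LONGCHAR NO-UNDO.

CREATE SAX-WRITER hWriter.

/* Write into a LONGCHAR so no file is needed (works in batch on Linux). */
hWriter:SET-OUTPUT-DESTINATION("LONGCHAR":U, lcOut).

hWriter:START-DOCUMENT().
hWriter:START-ELEMENT("p":U).                     /* hypothetical element name */
hWriter:WRITE-CHARACTERS("Fish & chips < 5 > 3"). /* raw, unescaped text       */
hWriter:END-ELEMENT("p":U).
hWriter:END-DOCUMENT().

DELETE OBJECT hWriter.

/* The ampersand and angle brackets inside <p> should now be escaped. */
MESSAGE STRING(lcOut).
```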

Posted by tbergman on 17-Aug-2016 07:15

While I use .NET now, here's an older version. I think it came from the Progress WebSpeed code, but I'm not certain.

FUNCTION htmlEncode RETURNS CHARACTER
  (INPUT p_in AS CHARACTER):
  /****************************************************************************
  Description: Converts various ASCII characters to their HTML representation
    to prevent problems with invalid HTML.  Call this function only once per
    string; a second call re-encodes each ampersand as "&amp;amp;".
  Input Parameter: Character string to encode
  Returns: Encoded character string
  ****************************************************************************/
  
  
/* Ampersand must be replaced first; otherwise the ampersands introduced by
   the other substitutions would themselves be encoded. */
  ASSIGN
    p_in = REPLACE(p_in, "&":U, "&amp~;":U)       /* ampersand */
    p_in = REPLACE(p_in, "~"":U, "&quot~;":U)     /* quote */
    p_in = REPLACE(p_in, "<":U, "&lt~;":U)        /* < */
    p_in = REPLACE(p_in, ">":U, "&gt~;":U).       /* > */
  RETURN p_in.
END FUNCTION. /* html-encode */
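
A short usage sketch for the function above, assuming it is defined earlier in the same procedure file. The variable name and sample string are made up; the second MESSAGE shows the double-encoding that the header comment warns about.

```abl
/* Assumes FUNCTION htmlEncode (above) is defined earlier in this .p file. */
DEFINE VARIABLE cRaw AS CHARACTER NO-UNDO.

cRaw = 'Say "hi" to <Tom & Jerry>'.

/* Encoded once:
   Say &quot;hi&quot; to &lt;Tom &amp; Jerry&gt; */
MESSAGE htmlEncode(cRaw).

/* Encoded twice: every ampersand from the first pass is encoded again,
   so "a & b" ends up as "a &amp;amp; b". */
MESSAGE htmlEncode(htmlEncode("a & b")).
```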

Posted by martinz on 17-Aug-2016 07:25

This will probably work just fine. Thanks!

Posted by DimitriG4 on 17-Aug-2016 07:39

...and you might want to throw in these as well, in case you have users working in languages other than English

<snip>

/* take care of foreign language characters; list found at webdesign.about.com/.../blhtmlcodes-sp.htm */
    p_in = REPLACE(p_in,CHR(193),"%C1").   /* take care of Capital A-Acute  */
    p_in = REPLACE(p_in,CHR(225),"%E1").   /* take care of lowercase a-acute   */
    p_in = REPLACE(p_in,CHR(201),"%C9").   /* take care of Capital E-acute     */
    p_in = REPLACE(p_in,CHR(233),"%E9").   /* take care of Lowercase e-acute   */
    p_in = REPLACE(p_in,CHR(205),"%CD").   /* take care of Capital I-acute     */
    p_in = REPLACE(p_in,CHR(237),"%ED").   /* take care of Lowercase i-acute   */
    p_in = REPLACE(p_in,CHR(209),"%D1").   /* take care of Capital N-tilde     */
    p_in = REPLACE(p_in,CHR(241),"%F1").   /* take care of Lowercase n-tilde   */
    p_in = REPLACE(p_in,CHR(211),"%D3").   /* take care of Capital O-acute     */
    p_in = REPLACE(p_in,CHR(243),"%F3").   /* take care of Lowercase o-acute   */
    p_in = REPLACE(p_in,CHR(218),"%DA").   /* take care of Capital U-acute     */
    p_in = REPLACE(p_in,CHR(250),"%FA").   /* take care of Lowercase u-acute   */
    p_in = REPLACE(p_in,CHR(220),"%DC").   /* take care of Capital U-umlaut    */
    p_in = REPLACE(p_in,CHR(252),"%FC").   /* take care of Lowercase u-umlaut  */
    p_in = REPLACE(p_in,CHR(171),"%AB").   /* take care of Left angle quotes   */
    p_in = REPLACE(p_in,CHR(187),"%BB").   /* take care of Right angle quotes  */
    p_in = REPLACE(p_in,CHR(191),"%BF").   /* take care of Inverted question mark     */
    p_in = REPLACE(p_in,CHR(161),"%A1").   /* take care of Inverted exclamation point */
    p_in = REPLACE(p_in,CHR(128),"%80").   /* take care of Euro   */

<snip>
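
One caveat on the list above: codes of the form "%C1" are URL percent-encoding, whereas HTML itself uses character references such as "&#193;" or "&Aacute;". Below is a rough sketch of the same REPLACE pattern with numeric character references. The function name htmlEncodeLatin and the variable cText are hypothetical, only a few characters are shown, and it assumes a single-byte session code page in which CHR(193) is capital A-acute (CHR(128) as the Euro sign only holds for Windows-1252). Copying into a CASE-SENSITIVE variable is a precaution so that upper- and lowercase accented letters are not matched against each other.

```abl
/* Hypothetical helper, not from the thread. Call it AFTER htmlEncode,
   otherwise the "&" in each reference gets encoded again. */
FUNCTION htmlEncodeLatin RETURNS CHARACTER
  (INPUT p_in AS CHARACTER):

  /* Case-sensitive copy so REPLACE compares characters exactly. */
  DEFINE VARIABLE cText AS CHARACTER CASE-SENSITIVE NO-UNDO.

  cText = p_in.

  ASSIGN
    cText = REPLACE(cText, CHR(193), "&#193~;":U)    /* capital A-acute   */
    cText = REPLACE(cText, CHR(225), "&#225~;":U)    /* lowercase a-acute */
    cText = REPLACE(cText, CHR(209), "&#209~;":U)    /* capital N-tilde   */
    cText = REPLACE(cText, CHR(241), "&#241~;":U)    /* lowercase n-tilde */
    cText = REPLACE(cText, CHR(128), "&#8364~;":U).  /* Euro (1252 only)  */

  RETURN cText.
END FUNCTION.
```

If the page is served with a charset that matches the data (UTF-8, for example), these characters can usually be sent as-is and only the four substitutions in htmlEncode are needed.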

Posted by Marc Fellman on 17-Aug-2016 08:22

That will also be solved by using the SAX or DOM parser.
