Export to Excel problems: A Follow up

I posted a while ago about how we modified the Export to Excel function to clean up some of our data on export, so Excel interprets it correctly. The problem addressed in that post is the fact that AX does not properly escape data (particularly double quotes) prior to sending it to Excel.

The solution in question basically intercepts the dataset generated by AX prior to sending it to Excel, runs it through a regex that replaces problem characters (in this case, double quotes) with a properly-escaped version (two double quotes). and then proceed to send the now-escaped data to Excel as usual.

This system works fine except for one situation: exceptionally large data sets take an enormous amount of memory to process, and tend to throw an OutOfMemoryException. In our situation, we tend to run into this with exports of unfiltered Item Details and broadly-filtered Production Orders. Since we need this data more often than one would think (especially the Item Details), we end up going into one of our updated test environments, disable the modification, and export the data from there. This is obviously cumbersome not only for our users, but also for our administrators who need to do the actual export.

Additionally, the modification did not address any other issues that tend to appear when dealing with Excel from AX (Note these issues aren’t directly tied to the Export to Excel function). In particular, we have come across some examples where newline characters and tab stops are added into various non-memo text boxes, like the Item Number and Item Description fields. If a report is ran which is then opened with Excel, these tab stops and newlines are preserved, throwing off the alignment of reports (in one case, we found an item description with 80 newline characters in front of it, pushing down the entire report 80 lines before any more data was displayed). Because these characters are non-printing, nobody notices them until reports that contain them are ran and someone notices the gaps. Even once they’re identified, most users do not know how to fix them because they are non-printing characters.

So, to help combat these issues, we decided to implement a system that would sanitize input as it is being entered. Because each value would be checked as it was entered, we no longer need to worry about the massive memory requirements. The system automatically removes the characters or changes them to more noticeable characters, like spaces, depending on where in the string they fall. We’ve also created it to be applied at a full table level, making it easy to apply to the places that need the most help, and being effective without any changes if a new field is added. Because some fields should be allowed to have these characters, we also added a way to exclude specific fields from being processed without compromising the functionality on the rest of the table.

First, the class we are using:

SysInputSanitation

class SysInputSanitation
{
    Set     excludedFields;
}

public static Common sanitize(Common record, FieldId _fieldId)
{
    SysInputSanitation  ths = new SysInputSanitation();
    ;

    return ths.doSanitize(record, _fieldId);
}

public str beginTrim(str input, container character)
{
    str     seekStr = conPeek(character, 1),
            repStr  = conPeek(character, 2);
    int     len     = strlen(seekStr),
            inc     = strlen(seekStr);
    ;

    while (substr(input, 1, len) == strRep(seekStr, len / inc))
    {
        len += inc;
    }

    len -= inc;

    return strfmt("%1%2",
                strRep(repStr, len / inc),
                substr(input, len + 1, strlen(input)));
}

public Common doSanitize(Common record, FieldId _fieldId)
{
    str             orgVal,
                    retVal,
                    seekStr,
                    inRepStr;
    List            excludedChars = this.excludedCharacters(record.TableId);
    ListEnumerator  listEnum;
    ;

    if (!excludedChars ||
        typeOf(record.(_fieldId)) != Types::String ||
        this.isExcludedField(record.TableId, _fieldId))
    {
        return record;
    }

    orgVal = record.(_fieldId);
    retVal = orgVal;

    listEnum = excludedChars.getEnumerator();

    while (listEnum.moveNext())
    {
        seekStr     = conpeek(listEnum.current(), 1);
        inRepStr    = conpeek(listEnum.current(), 3);
    
        if (strFind(retVal, seekStr, 1, strLen(retVal)) > 0)
        {
            retVal = this.beginTrim(retVal, listEnum.current());  //Trim (remove) the current character from the beginning of the string
            retVal = this.endTrim  (retVal, listEnum.current());  //Trim (remove) the current character from the end of the string
            retVal = strReplace(retVal, seekStr, inRepStr);       //Replace any other occurences of the string
        }
    }

    record.(_fieldId) = retVal;

    return record;
}

public str endTrim(str input, container character)
{
    str     seekStr = conPeek(character, 1),
            repStr  = conPeek(character, 2);
    int     len     = strlen(seekStr),
            inc     = strlen(seekStr);

    str     a, b, c;
    ;

    a = substr(input, strlen(input) - len, len);
    b = strRep(seekStr, len / inc);

    while (substr(input, strlen(input) - len + 1, len) == strRep(seekStr, len / inc))
    {
        len += inc;
    }
    
    len -= inc;
    
    return strfmt("%1%2",
                substr(input, 1, strlen(input) - len),
                strRep(repStr, len / inc));
}

protected List excludedCharacters(tableId _tableId)
{
    DictTable   dictTable           = new DictTable(_tableId);
    List        excludedCharacters;
    ;

    if (tableHasStaticMethod(dictTable, identifierStr("SanitizeCharacters")))
    {
        excludedCharacters = dictTable.callStatic("SanitizeCharacters");
        return excludedCharacters;
    }

    excludedCharacters = new List(Types::Container);

    //Container mappings are as follows:
    //[0] = Character(s) to replace
    //[1] = Replacement value if character(s) are found at the beginning or end of the string
    //[2] = Replacement value if character(s) are not at the beginning or end of the string
    excludedCharacters.addEnd(["\r\n", ""  , " " ]);
    excludedCharacters.addEnd(["\n\r", ""  , " " ]);
    excludedCharacters.addEnd(["\n"  , ""  , " " ]);
    excludedCharacters.addEnd(["\r"  , ""  , " " ]);
    excludedCharacters.addEnd(["\t"  , ""  , " " ]);
    excludedCharacters.addEnd(["\""  , "\"", "''"]);

    return excludedCharacters;
}

boolean isExcludedField(tableId _tableId, fieldId _fieldId)
{
    DictTable   dictTable = new DictTable(_tableId);
    Set         excludedFields;
    ;

    if (!tableHasStaticMethod(dictTable, identifierStr("ExcludeSanitizeFields")))
        return false;
        
    excludedFields = dictTable.callStatic("ExcludeSanitizeFields");

    if (excludedFields.in(_fieldId))
    {
        return true;
    }

    return false;
}

The main entry point is the static method sanitize, which takes a table record and a field number, and gives back a table record. The function will only clean one field at a time, because that is how the function we modify on the table side will also behave. Defined within the class are a list of excluded characters, which can be found in the excludedCharacters method. The characters can be overridden by adding a static SanitizeCharacters method directly on the table. Additionally, you can exclude specific table fields by adding a static ExcludeSanitizeFields method on the table. Note if you specify a SanitizeCharacters method, it will completely overwrite the defaults (newline/linefeed, linefeed/newline, newline, linefeed, tab and quotation mark).

public static Set ExcludeSanitizeFields()
{
    Set     tempSet = new Set(Types::Integer);
    ;
    
    tempSet.add(fieldNum(InventTable, ItemNotes));
    tempSet.add(fieldNum(InventTable, SpecificProductionNotes));
    
    return tempSet;
}

public static List SanitizeCharacters()
{
    excludedCharacters = new List(Types::Container);

    //Container mappings are as follows:
    //[0] = Character(s) to replace
    //[1] = Replacement value if character(s) are found at the beginning or end of the string
    //[2] = Replacement value if character(s) are not at the beginning or end of the string
    excludedCharacters.addEnd(["\r\n", ""  , " " ]);
    excludedCharacters.addEnd(["\n\r", ""  , " " ]);
    excludedCharacters.addEnd(["\n"  , ""  , " " ]);
    excludedCharacters.addEnd(["\r"  , ""  , " " ]);

    return excludedCharacters;
}

The excluded character list is actually a list of containers, each with 3 elements: the character(s) to look for, the character(s) to replace the first value with if they are found at the beginning or end of the string, and the character(s) to replace the first value with if they are found anywhere else in the string. This was because we wanted to define different behavior depending on where the strings were found - in some cases, we wanted to just remove them, and in others we wanted to replace them with spaces. The excluded fields are a list of integers, representing the field numbers for that table.

Finally, here is how we link this functionality into our table. Our code is triggered from the modifiedField method of the table:

public void modifiedField(fieldId _fieldId)
{
    super(_fieldId);

    this = SysInputSanitation::sanitize(this, _fieldId);

    switch (fieldExt2Id(_fieldId))
    {
        .
        .
        .
    }
}

Because we set up the main entry point as a static method, we do not need to worry about instantiating the class etc. This does cause some overhead, as the class will instantiate itself for each request (that is, for every modified field), but we can reasonably rely on the Garbage Collector to help keep that in check. The table record effectively passes through the class, coming out scrubbed on the other end. From there, we can proceed to other field-specific changes outlined in the switch statement, treating the scrubbed data as if that is what the user entered.

The change the values are virtually invisible to the user, especially as far as the non-printing characters are concerned.

I need to point out the reason we include "\r\n" as a separate entry from "\r" and "\n" in the excluded characters list. Normally, these two characters do come in pairs. If we left the list to just read "\r" and "\n" individually, only one of the pair on each end of the string would be trimmed, while the rest would be treated as inside the string and replaced with a different character. For example, running the value "\r\n\r\n\r\n\r\nValue\r\n\r\n\r\n\r\n" would result in the value of " Value " (with 7 spaces on each side), instead of "Value" (with no spaces) as we would expect.

Finally, now that future inputs are cleaned up, all that is left is cleaning up historical data. You can easily wrap this into a one-time job that would clean some (or all) fields in a table, depending on how you write it. Once that is completed, the original modification can be backed out and the original functionality of the Export to Excel can be restored.

The AX Grind

Export to Excel problems: A Follow up

James McCollum

author.bio

author.job