Overview

This article explains how to use a Batch Apex process to read a CSV file. You might need this asynchronous process to avoid the "Regex too complicated" error when importing a large amount of data from a CSV file. The developer community often suggests splitting the CSV file into several smaller files, but that is not an option if you need to read the data in one go.

To resolve this issue, you can create a Batch Apex process that imports a large amount of data from a single file and processes it.

Example Classes

In this article we are using two classes written by Marty Y. Chang.

To read a CSV file, these classes must be provided with the first row of the data and the delimiter used in the file.
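As an illustration of where those two inputs can come from, the following is a minimal sketch that extracts the first row from the raw file contents and guesses the delimiter from it. The class and method names (CsvHeaderSketch, firstRow, guessDelimiter) are hypothetical and are not part of the classes mentioned above. The guess is naive: it assumes the header row does not contain delimiter characters inside quoted values.

```apex
public class CsvHeaderSketch
{
   // Returns the text before the first line terminator,
   // or the whole string if no terminator is found.
   public static String firstRow(String csvData, String lineEnding)
   {
      Integer endOfRow = csvData.indexOf(lineEnding);
      return endOfRow == -1 ? csvData : csvData.substring(0, endOfRow);
   }

   // Picks whichever candidate delimiter occurs most often in the header row.
   // On a tie, the earlier candidate in the list wins.
   public static String guessDelimiter(String headerRow)
   {
      String best = ',';
      Integer bestCount = -1;
      for (String candidate : new List<String>{ ',', ';', '\t' })
      {
         Integer count = headerRow.countMatches(candidate);
         if (count > bestCount)
         {
            best = candidate;
            bestCount = count;
         }
      }
      return best;
   }
}
```

For example, CsvHeaderSketch.firstRow('Name;Amount\r\nAcme;10\r\n', '\r\n') returns 'Name;Amount', and passing that row to guessDelimiter returns ';'.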

Creating a Batch Apex Process

As the community suggests, the data must be split to avoid exceeding the heap size limit. Rather than splitting the file itself, you can create a batch process that reads the file in chunks and define the number of lines to read for each chunk.

To do this, create a Batch Apex process where the scope size defines the number of lines to read for each chunk. Scope size normally counts records, but in the sample code that follows it counts lines from a CSV file. The start method returns an Iterable<String> that contains the lines to be processed in the execute method. The execute method then reads its list of lines using the CSVReader in the same way as a synchronous (online) process would.
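The scope size is supplied when the job is launched, as the second argument to Database.executeBatch. The snippet below is a minimal sketch for anonymous Apex. It assumes the ReadAndPopulateBatch class from this article and its no-argument constructor, whose body is elided here, so how the CSV data reaches the batch is not shown. The value 100 mirrors the SCOPE_SIZE constant in the sample class.

```apex
// Assumption: the no-argument constructor loads the CSV data itself.
ReadAndPopulateBatch batch = new ReadAndPopulateBatch();

// The second argument is the scope size: the number of CSV lines
// handed to each call of the execute method.
Id jobId = Database.executeBatch(batch, 100);

// The job's progress can be checked through the AsyncApexJob object.
AsyncApexJob job = [
   SELECT Status, JobItemsProcessed, TotalJobItems
   FROM AsyncApexJob
   WHERE Id = :jobId
];
System.debug(job.Status);
```

A smaller scope size lowers the heap used by each execute call at the cost of more batch executions.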

global with sharing class ReadAndPopulateBatch implements Database.Batchable<String>, Database.Stateful
{
   private String m_csvFile;
   private Integer m_startRow;
   private CSVParser m_parser;
   private static final Integer SCOPE_SIZE = 100;
   public ReadAndPopulateBatch(){....}
   public static ID run(){....}
   // Returns the iterable that hands the CSV file to the batch one line at a time.
   global Iterable<String> start(Database.BatchableContext batchableContext)
   {
       return new CSVIterator(m_csvFile, m_parser.crlf);
   }
   // Called once per chunk; scope holds the lines of that chunk.
   global void execute(Database.BatchableContext batchableContext, List<String> scope)
   {
       //TODO: Create a map with the column name and the position.
       // Rebuild a small CSV document from the lines of this chunk.
       String csvFile = '';
       for(String row : scope)
       {
          csvFile += row + m_parser.crlf;
       }
       List<List<String>> csvLines = CSVReader.readCSVFile(csvFile,m_parser);
       //TODO: csvLines contains a List with the values of the CSV file.
       //This information will be used to create a custom object to
       //process it.
   }
   global void finish(Database.BatchableContext batchableContext){......}
}
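The first TODO in the execute method asks for a map from column name to position. One way this could look is sketched below. The class and method names (CsvColumnMapSketch, buildColumnMap) are hypothetical, and the sketch assumes the header row has already been split into its fields.

```apex
public class CsvColumnMapSketch
{
   // Maps each column name in the header row to its zero-based position.
   // Map keys are case-sensitive, so lookups must match the header spelling.
   public static Map<String, Integer> buildColumnMap(List<String> headerFields)
   {
      Map<String, Integer> positions = new Map<String, Integer>();
      for (Integer i = 0; i < headerFields.size(); i++)
      {
         positions.put(headerFields[i].trim(), i);
      }
      return positions;
   }
}
```

A value can then be looked up by name instead of by a hard-coded index:

```apex
Map<String, Integer> columns =
   CsvColumnMapSketch.buildColumnMap(new List<String>{ 'Name', 'Amount' });
List<String> row = new List<String>{ 'Acme', '10' };
String amount = row[columns.get('Amount')];
```

Because the header row only appears in the first chunk, the map would need to be kept in a member variable. The sample class implements Database.Stateful, which preserves member variables between execute calls.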

Creating the Iterable<String>

You must also create a class that implements both Iterator<String> and Iterable<String>. The class returns one string for each line in the file.

Here is an example:

global with sharing class CSVIterator implements Iterator<String>, Iterable<String>
{
   private String m_CSVData;
   private String m_introValue;
   public CSVIterator(String fileData, String introValue)
   {
      m_CSVData = fileData;
      m_introValue = introValue;
   }
   global Boolean hasNext()
   {
      // A remainder of one character or less is treated as the end of the data.
      return m_CSVData.length() > 1;
   }
   global String next()
   {
      Integer endOfRow = m_CSVData.indexOf(m_introValue);
      if (endOfRow == -1)
      {
         // Last row without a trailing line terminator: return what is left.
         String lastRow = m_CSVData;
         m_CSVData = '';
         return lastRow;
      }
      String row = m_CSVData.substring(0, endOfRow);
      // Drop the returned row and its line terminator from the remaining data.
      m_CSVData = m_CSVData.substring(endOfRow + m_introValue.length());
      return row;
   }
   global Iterator<String> iterator()
   {
      return this;
   }
}
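To see what the batch framework receives from this class, the iterator can be exercised on its own. The snippet below is a small sketch for anonymous Apex. The sample data is made up for illustration, and every line in it ends with the line terminator.

```apex
// Two sample lines, each ending with the line terminator.
String sample = 'Name,Amount\r\nAcme,10\r\n';
CSVIterator lines = new CSVIterator(sample, '\r\n');

// Collect the lines one at a time, as the batch framework does
// when it builds the scope list for each execute call.
List<String> rows = new List<String>();
while (lines.hasNext())
{
   rows.add(lines.next());
}

// rows now holds 'Name,Amount' and 'Acme,10'.
System.debug(rows);
```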

Use CSVReader.cls to Read the Lines or Create a Custom Method?

Part of CSVReader.cls does the same job as the CSVIterator.cls class shown earlier: splitting the data into lines. You can do one of the following:

  • Use CSVReader.cls in the execute method. The line splitting then runs twice, but you can use the method without modification.
  • Create a custom method. A custom method can parse each line directly, so the line splitting runs only once.
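One possible shape for such a custom method is sketched below. It is not the method from CSVReader.cls, and the names (CsvLineSketch, parseLine) are hypothetical. It splits a single line into fields without regular expressions, and assumes a single-character delimiter, double quotes around fields that contain the delimiter, and a doubled quote as an escaped quote. Line breaks inside quoted fields are not supported.

```apex
public class CsvLineSketch
{
   // Splits one CSV line into its fields, honoring double-quoted fields.
   public static List<String> parseLine(String line, String delimiter)
   {
      List<String> fields = new List<String>();
      String current = '';
      Boolean inQuotes = false;
      Integer i = 0;
      while (i < line.length())
      {
         String ch = line.substring(i, i + 1);
         if (inQuotes)
         {
            if (ch == '"')
            {
               // A doubled quote inside a quoted field is an escaped quote.
               if (i + 1 < line.length() && line.substring(i + 1, i + 2) == '"')
               {
                  current += '"';
                  i++;
               }
               else
               {
                  inQuotes = false;
               }
            }
            else
            {
               current += ch;
            }
         }
         else if (ch == '"')
         {
            inQuotes = true;
         }
         else if (ch.equals(delimiter))
         {
            // End of field: store it and start the next one.
            fields.add(current);
            current = '';
         }
         else
         {
            current += ch;
         }
         i++;
      }
      // The last field has no trailing delimiter.
      fields.add(current);
      return fields;
   }
}
```

In the execute method, each row in scope could then be passed to parseLine directly, instead of joining the rows back into one string for CSVReader. Reading one character at a time uses more CPU than a built-in split, so the scope size may need tuning.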


IMPORTANT USAGE NOTICE PLEASE READ 

Copyright © 2011, FinancialForce.com, inc

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  • Neither the name of the FinancialForce.com, inc nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.