tags:

views:

1186

answers:

7

Just saw a crash as we exceeded 255 columns. Maybe this question should go directly to POI, but lets say I do not want to disturb them in their effort to further develop the increadibly useful API it already is. ;-) The limitations page is not increadibly detailed.

So: What are your experience of actual limitations assuming the output shall be readable in Excel? Has anyone used POI to benchmark & explore soft & hard limitations of POI-generated Excel files?

The only limitations I can quickly find in the POI interface is the below. Microsoft lists further limitations for Excel that do not seem to be covered in POI.

EDIT: Oops. Just realized we have not refreshed POI in the last 5 years, so the code below may have been replaced 100 times over since then.

EDIT: The code below has not changed since then and ver 3.2 of Oct 19, 2008.

/**
 * @throws RuntimeException if the bounds are exceeded.
 */
private void checkBounds(int cellNum) {
  if (cellNum > 255) {
      throw new RuntimeException("You cannot have more than 255 columns "+
                "in a given row (IV).  Because Excel can't handle it");
  }
  else if (cellNum < 0) {
      throw new RuntimeException("You cannot reference columns with an index of less then 0.");
  }
}
+2  A: 

One of the biggest limitations I found with POI for writing Excel files was that it keeps the entire file contents in memory before it is written to file. For very large files (many rows), this became a real problem, leading to frequent OutOfMemory exceptions.

However, like you, this was with a very old version of POI. I am not sure if newer versions use memory more efficiently.

David Sykes
+1  A: 

Another serious limitation (not very well explained in my opinion) is that HSSFCellStyle are limited inside a workbook (I think this is an excel limitation).

You should not create a new style on every cell (because then, excel won't be able to open your workbook) but you have to keep a reference to them and reapply them when the cell style is similar.

So, you'll have to manage an internal cache of HSSFCellStyle, like this one for example : CellStyleCacheManager.

paulgreg
According to MSFT link above (http://office.microsoft.com/en-gb/excel/HP051992911033.aspx), you can have 4000 styles in a wb.
fredarin
Thanks to get the precise limit number ! That sounds big enough, hoewver, if you build a new CellStyle on each cell, you'll be limited to 266 rows of 15 columns (15*266=4000)... So, I think it's a limitation to keep in mind.
paulgreg
+1  A: 

At paulgreg concerning your CellStyleCacheManager: Although it is true, that this is a way to reuse styles, your setCellStyle() method expects an HSSFCellStyle parameter and the only way to create an HSSFCellStyle that I know of is to register it with the workbook by calling it's createCellStyle() method.

Although the cells actually use less styles, don't you end up with the same number of Styles registered with the workbook as without the cache? Or is there some kind of purging of unused styles in HSSF that I'm not aware of?

+1  A: 

Marliese, I get annoyed by this mistake in poi framework, and understood that I need a styleManager. These posts, make me feel all the work is done, until I came up the same conclusion as you. I don't want to reinvent the wheel, So I download the source of his framework and search for the uses of CellStyleManager.setCellStyle(). The fact is that in the code, two initial HSSFCellStyle objects are created, defaultCellStyle and cellStyle. Any time you customize a style use cellStyle, and then with CellStyleManager set it. If the style exists, it is reused, if not, it is created. Any other attemp to customize another style starts with a reset of cellStyle using the CellStyleHelper's function and the defaultCellStyle, which is left untouched for all the program. So in the end you will get two styles above the real needed, but so far better than use another api.

As paulgreg says the code is spread all over the framework but I join all the code is needed in just two classes. I left them here, until I wrote to paulgreg and poi develop team to incorporate it on its jar, because I think that for unknow data to write in an excel you need this kind of manager.

the changes are basically, that the manager knows the workbook, provides the style object, and implements the code of the CellStyleHelper. (They are less general because the manager needs to know the workbook and, overall, because you must use only a call of getGeneralStyle a time (because is the same object an it is reset in any call, but for the general use is the code that fits ) So to use it:

... Creates a workbook
CellStyleManager styleManager = new CellStyleManager(workbook);
... Create a cell
HSSFCellStyle style = styleManager.getGeneralStyle();
styleManager.setCellStyle(cell, style); // no more 4000 styles error!

The code: Thanks PaulGreg!

// CellStyleCacheManager.java

public class CellStyleCacheManager {
    protected Set cellStyles;
    protected HSSFCellStyle cellStyle;
    protected HSSFCellStyle defaultValuesCellStyle;
    protected HSSFWorkbook workbook;

    public CellStyleCacheManager(HSSFWorkbook workbook) {
     this.workbook = workbook;
     this.cellStyles = new HashSet();
     // El desperdicio de estilos será pués de dos
     cellStyle = workbook.createCellStyle();
     // Estilo almacenado para reiniciar el que se va a usar
     defaultValuesCellStyle = workbook.createCellStyle();
    }

    /** Si el estilo se crea con createCellStyle, ya no podremos hacer nada */
    public void setCellStyle(HSSFCell cell, HSSFCellStyle cellStyle) {
     CellStyleWrapper cellStyleWrp = new CellStyleWrapper(cellStyle);
     CellStyleWrapper cachedCellStyleWrp = null;

     Iterator it = cellStyles.iterator();

     while(it.hasNext() && (cachedCellStyleWrp == null)) {
      CellStyleWrapper tmpCachedCellStyleWrp = (CellStyleWrapper) it.next();

      if(tmpCachedCellStyleWrp.equals(cellStyleWrp)) {
       // Si algún estilo coincide con el actual usamos ese
       cachedCellStyleWrp = tmpCachedCellStyleWrp;
      }
     }

     if(cachedCellStyleWrp == null) {
      // Si el estilo no existe creamos uno nuevo
      HSSFCellStyle newCellStyle = workbook.createCellStyle();
      CellStyleCacheManager.copyCellStyle(workbook, cellStyle, newCellStyle);

      CellStyleWrapper newWrp = new CellStyleWrapper(newCellStyle);
      cellStyles.add(newWrp);
      cachedCellStyleWrp = newWrp;
     }

     cell.setCellStyle(cachedCellStyleWrp.getHSSFCellStyle());
    }

    public HSSFCellStyle getGeneralStyle() {
     copyCellStyle(workbook, cellStyle, defaultValuesCellStyle);
     return cellStyle;
    }

    public static void copyCellStyle(HSSFWorkbook wb, HSSFCellStyle c1, HSSFCellStyle c2) {
     c2.setAlignment(c1.getAlignment());
     c2.setBorderBottom(c1.getBorderBottom());
     c2.setBorderLeft(c1.getBorderLeft());
     c2.setBorderRight(c1.getBorderRight());
     c2.setBorderTop(c1.getBorderTop());
     c2.setBottomBorderColor(c1.getBottomBorderColor());
     c2.setDataFormat(c1.getDataFormat());
     c2.setFillBackgroundColor(c1.getFillBackgroundColor());
     c2.setFillForegroundColor(c1.getFillForegroundColor());
     c2.setFillPattern(c1.getFillPattern());

     try {
      c2.setFont(wb.getFontAt(c1.getFontIndex()));
     } catch(NullPointerException e) {
      TLogger.getInstance().log(e.getMessage());
     } catch(ArrayIndexOutOfBoundsException e) {
      TLogger.getInstance().log("Be sure to have intialized all POI font objects !\n%s",e.getMessage());
     }

     c2.setHidden(c1.getHidden());
     c2.setIndention(c1.getIndention());
     c2.setLeftBorderColor(c1.getLeftBorderColor());
     c2.setLocked(c1.getLocked());
     c2.setRightBorderColor(c1.getRightBorderColor());
     c2.setRotation(c1.getRotation());
     c2.setTopBorderColor(c1.getTopBorderColor());
     c2.setVerticalAlignment(c1.getVerticalAlignment());
     c2.setWrapText(c1.getWrapText());
    } 
}

CellStyleWrapper.java

public class CellStyleWrapper implements Comparable {
    private HSSFCellStyle cs;
    private int hashCode;

    public CellStyleWrapper(HSSFCellStyle cs) {
     this.cs = cs;
    }

    public boolean equals(Object obj) {
     CellStyleWrapper csWrp_;
     HSSFCellStyle cs_;

     try {
      csWrp_ = (CellStyleWrapper) obj;
     } catch(ClassCastException e) {
      return false;
     }

     cs_ = csWrp_.getHSSFCellStyle();

     return (cs.getAlignment() == cs_.getAlignment()) && (cs.getBorderBottom() == cs_.getBorderBottom())
       && (cs.getBorderLeft() == cs_.getBorderLeft()) && (cs.getBorderRight() == cs_.getBorderRight())
       && (cs.getBorderTop() == cs_.getBorderTop())
       && (cs.getBottomBorderColor() == cs_.getBottomBorderColor())
       && (cs.getDataFormat() == cs_.getDataFormat())
       && (cs.getFillBackgroundColor() == cs_.getFillBackgroundColor())
       && (cs.getFillForegroundColor() == cs_.getFillForegroundColor())
       && (cs.getFillPattern() == cs_.getFillPattern()) && (cs.getFontIndex() == cs_.getFontIndex())
       && (cs.getHidden() == cs_.getHidden()) && (cs.getIndention() == cs_.getIndention())
       && (cs.getLeftBorderColor() == cs_.getLeftBorderColor()) && (cs.getLocked() == cs_.getLocked())
       && (cs.getRightBorderColor() == cs_.getRightBorderColor()) && (cs.getRotation() == cs_.getRotation())
       && (cs.getTopBorderColor() == cs_.getTopBorderColor())
       && (cs.getVerticalAlignment() == cs_.getVerticalAlignment())
       && (cs.getWrapText() == cs_.getWrapText());
    }

    private int v(int i) {
     if(i == 0) {
      return 1;
     } else {
      return i;
     }
    }

    public int hashCode() {
     if(hashCode == 0) {
      hashCode = 17;
      hashCode = 37 * v(cs.getBorderBottom());
      hashCode = 37 * v(cs.getBorderLeft());
      hashCode = 37 * v(cs.getBorderRight());
      hashCode = 37 * v(cs.getBorderTop());
      hashCode = 37 * v(cs.getBottomBorderColor());
      hashCode = 37 * v(cs.getDataFormat());
      hashCode = 37 * v(cs.getFillBackgroundColor());
      hashCode = 37 * v(cs.getFillForegroundColor());
      hashCode = 37 * v(cs.getFillPattern());
      hashCode = 37 * v(cs.getFontIndex());
      hashCode = 37 * (cs.getHidden() ? 1 : (-1));
      hashCode = 37 * v(cs.getIndention());
      hashCode = 37 * v(cs.getLeftBorderColor());
      hashCode = 37 * (cs.getLocked() ? 1 : (-1));
      hashCode = 37 * v(cs.getRightBorderColor());
      hashCode = 37 * v(cs.getRotation());
      hashCode = 37 * v(cs.getTopBorderColor());
      hashCode = 37 * v(cs.getVerticalAlignment());
      hashCode = 37 * (cs.getWrapText() ? 1 : (-1));
     }

     return hashCode;
    }

    public int compareTo(Object obj) {
     int diff = 0;
     CellStyleWrapper csWrp_;
     HSSFCellStyle cs_;

     try {
      csWrp_ = (CellStyleWrapper) obj;
     } catch(ClassCastException e) {
      return -1;
     }

     cs_ = csWrp_.getHSSFCellStyle();

     diff = cs.getAlignment() - cs_.getAlignment();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getBorderBottom() - cs_.getBorderBottom();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getBorderLeft() - cs_.getBorderLeft();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getBorderRight() - cs_.getBorderRight();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getBorderTop() - cs_.getBorderTop();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getBottomBorderColor() - cs_.getBottomBorderColor();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getDataFormat() - cs_.getDataFormat();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getFillBackgroundColor() - cs_.getFillBackgroundColor();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getFillForegroundColor() - cs_.getFillForegroundColor();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getFillPattern() - cs_.getFillPattern();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getFontIndex() - cs_.getFontIndex();

     if(diff != 0) {
      return diff;
     }

     if(cs.getHidden() != cs_.getHidden()) {
      return -1;
     }

     diff = cs.getIndention() - cs_.getIndention();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getLeftBorderColor() - cs_.getLeftBorderColor();

     if(diff != 0) {
      return diff;
     }

     if(cs.getLocked() != cs_.getLocked()) {
      return -1;
     }

     diff = cs.getRightBorderColor() - cs_.getRightBorderColor();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getRotation() - cs_.getRotation();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getTopBorderColor() - cs_.getTopBorderColor();

     if(diff != 0) {
      return diff;
     }

     diff = cs.getVerticalAlignment() - cs_.getVerticalAlignment();

     if(diff != 0) {
      return diff;
     }

     if(cs.getWrapText() != cs_.getWrapText()) {
      return -1;
     }

     return 0;
    }

    public HSSFCellStyle getHSSFCellStyle() {
     return cs;
    }
}

Its funny that in the poi source, in the comments of HSSFCellStyle, comes this entry

// Why would you do that??
protected HSSFCellStyle(...

4000 styles limit man, Is that enough reason?

That hashCode method looks not only rather slow but not very effective as a hash: `if(hashCode == 0) { hashCode = 17; hashCode = 37 * v(cs.getBorderBottom()); hashCode = 37 * v(cs.getBorderLeft()); etc etc` ... hashCode is going to be set to the LAST of that long list of statements, is it not? So the only possible values are 0, 37, and -37 AFAICT.
John Machin
A: 

Really it seems some kind strange, but the way I use the code I don't need the hashCode so, I left that code there. I think that is something paulgreg had started but didn't finish yet.

"some kind strange" is a mild description. And why don't you "need" the hashCode? That's because you iterate over ALL of the cellStyles collection comparing against your new style!! At least a hash with only 3 possible values would give some chance of not doing an O(n) comparison. Aarrgghh!!
John Machin
+1  A: 

@nenopera

I liked your cache classes and converted them to .NET. I think I found a bug though.

In getGeneralStyle() there is a call to copyCellStyle(workbook, cellStyle, defaultValuesCellStyle);. This call copies the values from the cellStyle object into the defaultValuesCellStyle thus overwriting the defaults.

I think we want the opposite so it should be changed to: copyCellStyle(workbook, defaultValuesCellStyle, cellStyle);

jonaskello
A: 

Thank you PaulGreg for the code !

titi4u