Comma Separated Volumes

In my professional work I have had to do a lot of custom coding, just because the people I work with are “special” and generally don’t do things in a standard way. I’m fine with that. Actually, I love that. There are some things I feel I shouldn’t have to “custom code” for, though, and reading from CSV (comma separated values) is one of them.

With the project I’m currently working on, I need to read data in from a CSV file (a plain text file where columns are separated by commas and there is a header row). The file is about 27,000 lines long. The biggest problem is that the columns aren’t always in the same order, and sometimes additional columns are present. I thought for sure that going into the world of open source I would find some sort of pre-written tool, class or thingy to read the file for me and let me get the columns I want from each row. So far, my quest has turned up nothing. Apparently these files are always predictable, and you are just supposed to Split() each line as it is read in and capture the indexes you want. That seems simple enough, except that my file is somewhat unpredictable (one iteration may have 26 columns, the next may have 28).
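To make the idea concrete, here is a rough sketch of the kind of reader I have in mind: look at the header row first, work out where each column sits in this particular file, then pull values out by name instead of by a fixed index. It is written in Python purely as an assumption for illustration, and the function name `read_columns` and the sample column names are made up, not taken from my real file.

```python
import csv
import io


def read_columns(lines, wanted):
    """Yield one dict per data row, holding only the wanted columns."""
    reader = csv.reader(lines)
    header = next(reader)
    # Map each header name to its position in this particular file.
    index_of = {name.strip(): i for i, name in enumerate(header)}
    missing = [name for name in wanted if name not in index_of]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    for row in reader:
        if not row:
            continue  # skip blank lines
        yield {name: row[index_of[name]] for name in wanted}


# Two hypothetical files: same data, different column order and count.
file_a = "id,name,city\n1,Ann,Oslo\n"
file_b = "city,extra,id,name\nOslo,x,1,Ann\n"

rows_a = list(read_columns(io.StringIO(file_a), ["id", "name"]))
rows_b = list(read_columns(io.StringIO(file_b), ["id", "name"]))
```

Because the lookup goes through the header, both sample files produce the same rows even though one has an extra column and a different order.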

My problem is that I am caught slightly off-guard by the fact that there is nothing pre-written. How is it that such a simple thing has not been written? I suppose once I have finished my coding it will be my responsibility to clean up the class and release it for the masses.