Fuzzy-CSV

GET FuzzyCsv

FuzzyCsv is published to both maven central and jitpack.io. Starting from 1.7.1 all versions of FuzzyCSV will be published to maven central

GET FUZZYCSV FROM MAVEN CENTRAL

For maven add the dependency to your pom file like this

Maven

<dependency>
     <groupId>io.github.kayr</groupId>
     <artifactId>fuzzy-csv</artifactId>
     <version>${version}</version>
</dependency>

Gradle:

 compile 'io.github.kayr:fuzzy-csv:${version}'

GET FUZZYCSV FROM JITPACK.IO

The to get artefacts from jipack the name space changes from io.github.kayr to com.github.kayr.

For Maven

    ...
    <!-- Add this Repository -->
    <repositories>
        <repository>
            <id>jitpack.io</id>
            <url>https://jitpack.io</url>
        </repository>
    </repositories>
    ...
    <dependency>
            <groupId>com.github.kayr</groupId>
            <artifactId>fuzzy-csv</artifactId>
            <version>${version}</version>
    </dependency>
    ...

For Gradle

repositories {
    /* Add this repository */
    maven { url 'https://jitpack.io' }
}

dependencies{
    ...
    compile 'com.github.kayr:fuzzy-csv:${version}'
    ...
}

USAGE

EXAMPLES

Goto Examples showing usages of FuzzyCSV

DATA INITIALIZATION

Loading Data into Fuzzy-CSV

FuzzyCSVTable.fromResultSet(sqlResultSet)
FuzzyCSVTable.fromSqlQuery(groovySql, "select * from table")
FuzzyCSVTable.fromListList(listMap)
FuzzyCSVTable.fromMapList(listOfLists)
FuzzyCSVTable.fromJsonText('''[["colum"],["value1"]]''')
//parse
FuzzyCSVTable.fromCsvString(csvString)
FuzzyCSVTable.fromCsvReader(reader)
//if you wish to customise the parsing you can provide more options
FuzzyCSVTable.fromCsvString(csvString, separator/* , */, quoteChar /* " */, escapeChar /* \ */)

Visualize json data in a console grid table

You can convert json text to a grid console table.

import fuzzycsv.FuzzyCSVTable

def jsonText = '''
{
  "id": "0001",
  "type": "donut",
  "name": "Cake",
  "ppu": 0.55,
  "batters":
  {
    "batter":
    [
      { "id": "1001", "type": "Regular" },
      { "id": "1002", "type": "Chocolate","color": "Brown" }
    ]
  },
  "topping":
  [
    { "id": "5001", "type": "None" },
    { "id": "5002", "type": "Glazed" },
    { "id": "5005", "type": "Sugar" ,"color": "Brown"}
  ]
}'''

FuzzyCSVTable.fromJsonText(jsonText)
        .asListGrid()
        .printTable()

Output

╔═════════╤═══════════════════════════════════════════╗
║ key     │ value                                     ║
╠═════════╪═══════════════════════════════════════════╣
║ id      │ 0001                                      ║
╟─────────┼───────────────────────────────────────────╢
║ type    │ donut                                     ║
╟─────────┼───────────────────────────────────────────╢
║ name    │ Cake                                      ║
╟─────────┼───────────────────────────────────────────╢
║ ppu     │ 0.55                                      ║
╟─────────┼───────────────────────────────────────────╢
║ batters │ ╔════════╤══════════════════════════════╗ ║
║         │ ║ key    │ value                        ║ ║
║         │ ╠════════╪══════════════════════════════╣ ║
║         │ ║ batter │ ╔══════╤═══════════╤═══════╗ ║ ║
║         │ ║        │ ║ id   │ type      │ color ║ ║ ║
║         │ ║        │ ╠══════╪═══════════╪═══════╣ ║ ║
║         │ ║        │ ║ 1001 │ Regular   │ -     ║ ║ ║
║         │ ║        │ ╟──────┼───────────┼───────╢ ║ ║
║         │ ║        │ ║ 1002 │ Chocolate │ Brown ║ ║ ║
║         │ ║        │ ╚══════╧═══════════╧═══════╝ ║ ║
║         │ ╚════════╧══════════════════════════════╝ ║
╟─────────┼───────────────────────────────────────────╢
║ topping │ ╔══════╤════════╤═══════╗                 ║
║         │ ║ id   │ type   │ color ║                 ║
║         │ ╠══════╪════════╪═══════╣                 ║
║         │ ║ 5001 │ None   │ -     ║                 ║
║         │ ╟──────┼────────┼───────╢                 ║
║         │ ║ 5002 │ Glazed │ -     ║                 ║
║         │ ╟──────┼────────┼───────╢                 ║
║         │ ║ 5005 │ Sugar  │ Brown ║                 ║
║         │ ╚══════╧════════╧═══════╝                 ║
╚═════════╧═══════════════════════════════════════════╝

RECORD FUNCTIONS

These Help you write expression or functions for a record.E.g A function multiplying price by quantity.The record function run in two modes:

One with type coercion which can be created using RecordFx.fn{}.This mode is lenient and does not throw most exceptions.This mode supports division of nulls(null/2), zero(2/0) division and type coercion("2"/2 or Object/null) . This mode adds extra overhead and is much slower if your are dealing with lots of records.
Another mode is RecordFx.fx{} which uses the default groovy evaluator.This mode is much faster if you are working with lots of records.However, this mode is not lenient and hence can throw java.lang.ArithmeticException: Division by zero.If you want to enable leniency but still want to use the faster RecordFx.fx{} you can wrap your code in the fuzzycsv.FxExtensions category(e.g use(FxExtensions){ …code here.. }) So the category is registered only once as compared to the former where the category is reqistered on each and every evaluation.

DATA MERGING

Merging with a fuzzy match

Set the accuracy threshold to 75%
Merge using code below

import fuzzycsv.FuzzyCSV
import fuzzycsv.FuzzyCSVTable

def csv1 = [
        ['first name', 'sex'],
        ['alex', 'male'],
        ['sara', 'female']
]

def csv2 = [
        ['ferts nama', 'age', 'sex'],
        ['alex', '21', 'male'],
        ['peter', '21', 'male']
]

//set accuracy threshold
FuzzyCSV.ACCURACY_THRESHOLD.set(0.75)

FuzzyCSVTable.tbl(csv1)
        .mergeByColumn(csv2)
        .printTable()

Output

╔════════════╤════════╤═════╗
║ first name │ sex    │ age ║
╠════════════╪════════╪═════╣
║ alex       │ male   │ -   ║
╟────────────┼────────┼─────╢
║ sara       │ female │ -   ║
╟────────────┼────────┼─────╢
║ alex       │ male   │ 21  ║
╟────────────┼────────┼─────╢
║ peter      │ male   │ 21  ║
╚════════════╧════════╧═════╝

Notice how it merged [first name] and [ferts nama]

Joins

Inner join

tbl(csv1).join(csv2, 'name')
        .printTable()

Output

╔══════╤══════╤═════╤════════╗
║ name │ sex  │ age │ hobby  ║
╠══════╪══════╪═════╪════════╣
║ alex │ male │ 21  │ biking ║
╚══════╧══════╧═════╧════════╝

Left Join

tbl(csv1).leftJoin(csv2, 'name')
        .printTable()

Output

╔══════╤════════╤═════╤════════╗
║ name │ sex    │ age │ hobby  ║
╠══════╪════════╪═════╪════════╣
║ alex │ male   │ 21  │ biking ║
╟──────┼────────┼─────┼────────╢
║ sara │ female │ -   │ -      ║
╚══════╧════════╧═════╧════════╝

Right Join

tbl(csv1).rightJoin(csv2, 'name')
        .printTable()

Output

╔═══════╤══════╤═════╤══════════╗
║ name  │ sex  │ age │ hobby    ║
╠═══════╪══════╪═════╪══════════╣
║ alex  │ male │ 21  │ biking   ║
╟───────┼──────┼─────┼──────────╢
║ peter │ -    │ 21  │ swimming ║
╚═══════╧══════╧═════╧══════════╝

Full Join

tbl(csv1).fullJoin(csv2, 'name')
        .printTable()

Output

╔═══════╤════════╤═════╤══════════╗
║ name  │ sex    │ age │ hobby    ║
╠═══════╪════════╪═════╪══════════╣
║ alex  │ male   │ 21  │ biking   ║
╟───────┼────────┼─────┼──────────╢
║ sara  │ female │ -   │ -        ║
╟───────┼────────┼─────┼──────────╢
║ peter │ -      │ 21  │ swimming ║
╚═══════╧════════╧═════╧══════════╝

Join With Custom Function

tbl(csv1).fullJoin(tbl(csv2)) { it.left('name') == it.right('name') }
        .printTable()

Output

╔══════╤════════╤═══════╤═════╤══════════╗
║ name │ sex    │ name  │ age │ hobby    ║
╠══════╪════════╪═══════╪═════╪══════════╣
║ alex │ male   │ alex  │ 21  │ biking   ║
╟──────┼────────┼───────┼─────┼──────────╢
║ sara │ female │ -     │ -   │ -        ║
╟──────┼────────┼───────┼─────┼──────────╢
║ -    │ -      │ peter │ 21  │ swimming ║
╚══════╧════════╧═══════╧═════╧══════════╝

DATA QUERYING

Iterating over records

You can use any of the groovy methods to iterate of a FuzzyCSVTable since it implements Iterable.

import static fuzzycsv.FuzzyStaticApi.*

def csv = [
        ['name', 'age', 'hobby'],
        ['alex', '21', 'biking'],
        ['peter', '21', 'swimming']
]


assert tbl(csv).collect { it.name } == ['alex', 'peter']
assert tbl(csv).inject('') { acc, record -> acc + record.name + ',' } == 'alex,peter,'
tbl(csv).each { println(it.name) }

Output

alex
peter

Get Cell Value or First Value

assert tbl(csv)['name'][0] == 'alex'
assert tbl(csv)['name'][1] == 'peter'
assert tbl(csv).firstCell() == 'alex'

Filter Records

def csv = [
        ['name', 'age', 'hobby'],
        ['alex', 21, 'biking'],
        ['peter', 13, 'swimming']]

tbl(csv).filter { it.name == 'alex' }
        .printTable()

Output

╔══════╤═════╤════════╗
║ name │ age │ hobby  ║
╠══════╪═════╪════════╣
║ alex │ 21  │ biking ║
╚══════╧═════╧════════╝

Sorting

def csv2 = [
        ['name', 'age', 'hobby'],
        ['alex', '21', 'biking'],
        ['martin', '40', 'swimming'],
        ['dan', '25', 'swimming'],
        ['peter', '21', 'swimming'],
]

def sorted1 = tbl(csv2).sort('age', 'name').printTable()

//or sort using closure
def sorted2 = tbl(csv2).sort { "$it.age $it.name" }.printTable()

Output

---- SORTED WITH COLUMN NAMES
╔════════╤═════╤══════════╗
║ name   │ age │ hobby    ║
╠════════╪═════╪══════════╣
║ alex   │ 21  │ biking   ║
╟────────┼─────┼──────────╢
║ peter  │ 21  │ swimming ║
╟────────┼─────┼──────────╢
║ dan    │ 25  │ swimming ║
╟────────┼─────┼──────────╢
║ martin │ 40  │ swimming ║
╚════════╧═════╧══════════╝


---- SORTED WITH CLOSURE
╔════════╤═════╤══════════╗
║ name   │ age │ hobby    ║
╠════════╪═════╪══════════╣
║ alex   │ 21  │ biking   ║
╟────────┼─────┼──────────╢
║ peter  │ 21  │ swimming ║
╟────────┼─────┼──────────╢
║ dan    │ 25  │ swimming ║
╟────────┼─────┼──────────╢
║ martin │ 40  │ swimming ║
╚════════╧═════╧══════════╝

Ranges

Ranges help slice the table data. e.g. selecting last 2, top 2, 3rd to 2nd last record

def csv = [
        ['name', 'age', 'hobby'],
        ['alex', '21', 'biking'],
        ['martin', '40', 'swimming'],
        ['dan', '25', 'swimming'],
        ['peter', '21', 'swimming'],
]


//top 2
def firstTwo = tbl(csv)[1..2].printTable()

//last 2
def lastTwo = tbl(csv)[-1..-2].printTable()

//2nd and 3rd
def middle = tbl(csv)[2..3].printTable()

Output

---- TOP 2
╔════════╤═════╤══════════╗
║ name   │ age │ hobby    ║
╠════════╪═════╪══════════╣
║ alex   │ 21  │ biking   ║
╟────────┼─────┼──────────╢
║ martin │ 40  │ swimming ║
╚════════╧═════╧══════════╝


---- LAST 2
╔═══════╤═════╤══════════╗
║ name  │ age │ hobby    ║
╠═══════╪═════╪══════════╣
║ peter │ 21  │ swimming ║
╟───────┼─────┼──────────╢
║ dan   │ 25  │ swimming ║
╚═══════╧═════╧══════════╝


---- MIDDLE 2
╔════════╤═════╤══════════╗
║ name   │ age │ hobby    ║
╠════════╪═════╪══════════╣
║ martin │ 40  │ swimming ║
╟────────┼─────┼──────────╢
║ dan    │ 25  │ swimming ║
╚════════╧═════╧══════════╝

Record Navigation

Example showing running sum

@Grab('io.github.kayr:fuzzy-csv:1.9.1-groovy4')
import static fuzzycsv.FuzzyStaticApi.*

def csv = [["name", "age"],
           ["kay", 1],
           ["sa", 22],
           ["kay2", 1],
           ["ben", 10]]


//add running sum of age

tbl(csv).addColumn(fx("running_sum") { (it.up()?.running_sum ?: 0) + it.age })
        .printTable()

Output

╔══════╤═════╤═════════════╗
║ name │ age │ running_sum ║
╠══════╪═════╪═════════════╣
║ kay  │ 1   │ 1           ║
╟──────┼─────┼─────────────╢
║ sa   │ 22  │ 23          ║
╟──────┼─────┼─────────────╢
║ kay2 │ 1   │ 24          ║
╟──────┼─────┼─────────────╢
║ ben  │ 10  │ 34          ║
╚══════╧═════╧═════════════╝

Or sum bottom value with current value

tbl(csv).addColumn(fx("bottom_up") { (it.down().age ?: 0) + it.age }).printTable()

Cell Navigation

Navigators help move through the table cells easily.You can look above,below, right or left of a cell.

import fuzzycsv.nav.Navigator

import static fuzzycsv.FuzzyStaticApi.tbl

def csv = [
        ['name', 'age', 'hobby', 'category'],
        ['alex', '21', 'biking', 'A'],
        ['peter', '21', 'swimming', 'S'],
        ['charles', '21', 'swimming', 'S'],
        ['barbara', '23', 'swimming', 'S']
]

def navigator = new Navigator(0, 0, tbl(csv))


assert navigator.down().down().value() == 'peter'
assert navigator.right().value() == 'age'
assert navigator.right().left().value() == 'name'
assert navigator.down().up().value() == 'name'

// Move down
assert navigator.downIter().collect { it.value() } == ['name', 'alex', 'peter', 'charles', 'barbara']

// MoveRight
assert navigator.rightIter().collect { it.value() } == ['name', 'age', 'hobby', 'category']

//move through all
assert navigator.allIter().collect { it.value() } == ['name', 'age', 'hobby', 'category', 'alex', '21', 'biking', 'A', 'peter', '21', 'swimming',
                                                      'S', 'charles', '21', 'swimming', 'S', 'barbara', '23', 'swimming', 'S']
//move through all bounded
assert navigator.allBoundedIter(1, 2).collect { it.value() } == ['name', 'age', 'alex', '21', 'peter', '21']

//move up
assert new Navigator(0, 4, tbl(csv)).upIter().collect { it.value() } == ['barbara', 'charles', 'peter', 'alex', 'name']

Select function

To shoe the difference of how Record.fx and Record.fn observe the output of the following code snippet

def csv2 = [
        ['price', 'quantity'],
        ['2', 5],
        ['3', 4]
]

//using the fn functions demonstrates how you can do arithmetic on strings and will get a valid result
def r1 = tbl(csv2)
        .select('price', 'quantity', fn('total') { it.price * it.quantity })
        .printTable()

// here we will use the fx function notice the output will be a repeated string.
def r2 = tbl(csv2)
        .select('price', 'quantity', fx('total') { it.price * it.quantity })
        .printTable()

Output

--- USING FN FUNCTION
╔═══════╤══════════╤═══════╗
║ price │ quantity │ total ║
╠═══════╪══════════╪═══════╣
║ 2     │ 5        │ 10    ║
╟───────┼──────────┼───────╢
║ 3     │ 4        │ 12    ║
╚═══════╧══════════╧═══════╝


--- USING FX FUNTION
╔═══════╤══════════╤═══════╗
║ price │ quantity │ total ║
╠═══════╪══════════╪═══════╣
║ 2     │ 5        │ 22222 ║
╟───────┼──────────┼───────╢
║ 3     │ 4        │ 3333  ║
╚═══════╧══════════╧═══════╝

DATA MANIPULATION FUNCTIONS

Delete Records

def csv = [
        ['name', 'age', 'hobby', 'category'],
        ['alex', '21', 'biking', 'A'],
        ['peter', '21', 'swimming', 'S'],
        ['charles', '21', 'swimming', 'S'],
        ['barbara', '23', 'swimming', 'S']]

tbl(csv).delete { it.age == '21' }
        .printTable()

Output

╔═════════╤═════╤══════════╤══════════╗
║ name    │ age │ hobby    │ category ║
╠═════════╪═════╪══════════╪══════════╣
║ barbara │ 23  │ swimming │ S        ║
╚═════════╧═════╧══════════╧══════════╝

Deduplicate by column

def csv = [
        ['name', 'age', 'hobby', 'category'],
        ['alex', '21', 'biking', 'A'],
        ['peter', '21', 'swimming', 'S'],
        ['charles', '21', 'swimming', 'S'],
        ['barbara', '23', 'swimming', 'S']]

tbl(csv).distinctBy('age', 'category')
        .printTable()

Output

╔═════════╤═════╤══════════╤══════════╗
║ name    │ age │ hobby    │ category ║
╠═════════╪═════╪══════════╪══════════╣
║ alex    │ 21  │ biking   │ A        ║
╟─────────┼─────┼──────────┼──────────╢
║ peter   │ 21  │ swimming │ S        ║
╟─────────┼─────┼──────────┼──────────╢
║ barbara │ 23  │ swimming │ S        ║
╚═════════╧═════╧══════════╧══════════╝

Adding Records

def csv = tbl([
        ['name', 'number'],
        ['join', 1.1]])

csv.addRecordArr("JB", 455)
        .addRecord(["JLis", 767])
        .addRecordMap([name: "MName", number: 90])
        .addRecordArr()
        .addRecordMap([name: "MNameEmp"])
        .printTable()

Output

╔══════════╤════════╗
║ name     │ number ║
╠══════════╪════════╣
║ join     │ 1.1    ║
╟──────────┼────────╢
║ JB       │ 455    ║
╟──────────┼────────╢
║ JLis     │ 767    ║
╟──────────┼────────╢
║ MName    │ 90     ║
╟──────────┼────────╢
║ -        │ -      ║
╟──────────┼────────╢
║ MNameEmp │ -      ║
╚══════════╧════════╝

Update values with where clause

def csv2 = [
        ['name', 'age', 'hobby'],
        ['alex', '21', 'biking'],
        ['martin', '40', 'swimming'],
        ['dan', '25', 'swimming'],
        ['peter', '21', 'swimming'],
]

tbl(csv2).modify {
    set {
        it.hobby = "running"
        it.age  = '900'
    }
    where {
        it.name in ['dan', 'alex']
    }
}.printTable()

Output

╔════════╤═════╤══════════╗
║ name   │ age │ hobby    ║
╠════════╪═════╪══════════╣
║ alex   │ 900 │ running  ║
╟────────┼─────┼──────────╢
║ martin │ 40  │ swimming ║
╟────────┼─────┼──────────╢
║ dan    │ 900 │ running  ║
╟────────┼─────┼──────────╢
║ peter  │ 21  │ swimming ║
╚════════╧═════╧══════════╝

Transform each cell record

def table = tbl([
        ['name', 'age', 'hobby'],
        ['alex', '21', 'biking'],
        ['martin', '40', 'swimming'],
        ['dan', '25', 'swimming'],
        ['peter', '21', 'swimming'],
])
table.transform { it.padRight(10, '-') }.printTable()

Output

╔════════════╤════════════╤════════════╗
║ name       │ age        │ hobby      ║
╠════════════╪════════════╪════════════╣
║ alex------ │ 21-------- │ biking---- ║
╟────────────┼────────────┼────────────╢
║ martin---- │ 40-------- │ swimming-- ║
╟────────────┼────────────┼────────────╢
║ dan------- │ 25-------- │ swimming-- ║
╟────────────┼────────────┼────────────╢
║ peter----- │ 21-------- │ swimming-- ║
╚════════════╧════════════╧════════════╝

DATA RESHAPING FUNCTIONS (COLUMN MANIPULATION)

Delete Column

tbl(csv2).delete('name', 'age').printTable()

Output

╔══════════╗
║ hobby    ║
╠══════════╣
║ biking   ║
╟──────────╢
║ swimming ║
╚══════════╝

Add Column

tbl(csv).addColumn(fx('Double Age') { it.age * 2 })
        .printTable()

Output

╔═══════╤═════╤══════════╤════════════╗
║ name  │ age │ hobby    │ Double Age ║
╠═══════╪═════╪══════════╪════════════╣
║ alex  │ 21  │ biking   │ 42         ║
╟───────┼─────┼──────────┼────────────╢
║ peter │ 13  │ swimming │ 26         ║
╚═══════╧═════╧══════════╧════════════╝

Transposing

tbl(csv2).transpose()
        .printTable()

Output

╔═══════╤════════╤══════════╤══════════╤══════════╗
║ name  │ alex   │ martin   │ dan      │ peter    ║
╠═══════╪════════╪══════════╪══════════╪══════════╣
║ age   │ 21     │ 40       │ 25       │ 21       ║
╟───────┼────────┼──────────┼──────────┼──────────╢
║ hobby │ biking │ swimming │ swimming │ swimming ║
╚═══════╧════════╧══════════╧══════════╧══════════╝

Unwinding a column

This is kind can be used to unwind a column which has nested lists or collections

import static fuzzycsv.FuzzyStaticApi.*

def csv = [
        ['name',     'AgeList'  ],
        ['biking',   [21,16]    ],
        ['swimming', [21,15]    ]
]


tbl(csv).unwind('AgeList')
        .printTable()

Output

╔══════════╤═════════╗
║ name     │ AgeList ║
╠══════════╪═════════╣
║ biking   │ 21      ║
╟──────────┼─────────╢
║ biking   │ 16      ║
╟──────────┼─────────╢
║ swimming │ 21      ║
╟──────────┼─────────╢
║ swimming │ 15      ║
╚══════════╧═════════╝

Spreading a column

Expand outwards a column which contains list items

import static fuzzycsv.FuzzyStaticApi.*

def csv = [
        ['name',     'AgeList'  ],
        ['biking',   [21,16]    ],
        ['swimming', [21,15]    ]
]


tbl(csv).spread('AgeList')
        .printTable()

Output

╔══════════╤═══════════╤═══════════╗
║ name     │ AgeList_1 │ AgeList_2 ║
╠══════════╪═══════════╪═══════════╣
║ biking   │ 21        │ 16        ║
╟──────────┼───────────┼───────────╢
║ swimming │ 21        │ 15        ║
╚══════════╧═══════════╧═══════════╝

Spread out a column with maps

def csv = [
        ['name', 'Age'],
        ['biking', [age: 21, height: 16]],
        ['swimming', [age: 21, height: 15]]
]

tbl(csv).spread('Age')
        .printTable()

Output

╔══════════╤═════════╤════════════╗
║ name     │ Age_age │ Age_height ║
╠══════════╪═════════╪════════════╣
║ biking   │ 21      │ 16         ║
╟──────────┼─────────┼────────────╢
║ swimming │ 21      │ 15         ║
╚══════════╧═════════╧════════════╝

Spread with custom column names

import static fuzzycsv.FuzzyStaticApi.*

def csv = [
        ['name', 'Age'],
        ['biking', [age: 21, height: 16]],
        ['swimming', [age: 21, height: 15]]
]

tbl(csv).spread(spreader("Age") { col, key -> "MyColName: $key" })
        .printTable()

Output

╔══════════╤════════════════╤═══════════════════╗
║ name     │ MyColName: age │ MyColName: height ║
╠══════════╪════════════════╪═══════════════════╣
║ biking   │ 21             │ 16                ║
╟──────────┼────────────────┼───────────────────╢
║ swimming │ 21             │ 15                ║
╚══════════╧════════════════╧═══════════════════╝

Move column

import static fuzzycsv.FuzzyStaticApi.tbl

def csv = [
        ['name', 'age', 'hobby', 'category'],
        ['alex', '21', 'biking', 'A'],
        ['peter', '21', 'swimming', 'S'],
        ['charles', '21', 'swimming', 'S'],
        ['barbara', '23', 'swimming', 'S']
]
tbl(csv).moveCol("age", "category")
        .printTable()

Output

╔═════════╤══════════╤══════════╤═════╗
║ name    │ hobby    │ category │ age ║
╠═════════╪══════════╪══════════╪═════╣
║ alex    │ biking   │ A        │ 21  ║
╟─────────┼──────────┼──────────┼─────╢
║ peter   │ swimming │ S        │ 21  ║
╟─────────┼──────────┼──────────┼─────╢
║ charles │ swimming │ S        │ 21  ║
╟─────────┼──────────┼──────────┼─────╢
║ barbara │ swimming │ S        │ 23  ║
╚═════════╧══════════╧══════════╧═════╝

AGGREGATION FUNCTIONS

Simplistic Aggregations

In the example below we find the average age in each hobby by making use of sum count and group by functions

import static fuzzycsv.FuzzyStaticApi.*

def csv2 = [
        ['name', 'age', 'Hobby'],
        ['alex', '21', 'biking'],
        ['peter', '21', 'swimming'],
        ['davie', '15', 'swimming'],
        ['sam', '16', 'biking'],
]


tbl(csv2).summarize(

        'Hobby',

        sum('age').az('TT.Age'),

        count('name').az('TT.Count')
).printTable()

Output

╔══════════╤════════╤══════════╗
║ Hobby    │ TT.Age │ TT.Count ║
╠══════════╪════════╪══════════╣
║ biking   │ 37     │ 2        ║
╟──────────┼────────┼──────────╢
║ swimming │ 36     │ 2        ║
╚══════════╧════════╧══════════╝

Custom Aggregation

tbl(csv2).summarize(
        'Hobby',
        reduce { group -> group['age'] }.az('AgeList')
).printTable()

Output

╔══════════╤══════════╗
║ Hobby    │ AgeList  ║
╠══════════╪══════════╣
║ biking   │ [21, 16] ║
╟──────────┼──────────╢
║ swimming │ [21, 15] ║
╚══════════╧══════════╝

EXPORT FUNCTIONS

CSV To MapList

tbl(csv).toMapList()

Output

[[name:alex, age:21, hobby:biking], [name:peter, age:21, hobby:swimming]]

CSV To POJO List

@Grab('io.github.kayr:fuzzy-csv:1.9.1-groovy4')
import static fuzzycsv.FuzzyCSVTable.tbl

def csv = [
        ['name', 'age', 'hobby'],
        ['alex', '21', 'biking'],
        ['peter', '21', 'swimming']]

class Person {
    String name
    Integer age
    String hobby
}

List<Person> people = tbl(csv).toPojoList(Person.class)

assert people.size() == 3
assert people.first().name == 'alex'
assert people.first().age == 21

Sql To CSV

groovy.sql.Sql groovySql = null//...
FuzzyCSVTable.toCSV(groovySql,"select * from PERSON")

ResultSet resultSet = null//..
FuzzyCSVTable.toCSV(resultSet)

EXCEL UTILITY CLASSES

To use the Excel utilities you have to add the poi dependency to your classpath

If you are using gradle add this.

 compile 'org.apache.poi:poi-ooxml:3.16', {
     exclude group: 'stax', module: 'stax-api'
 }
 compile 'org.apache.poi:ooxml-schemas:1.3', {
     exclude group: 'stax', module: 'stax-api'
 }

After this you can use the Excel utilities to convert excel sheets to and from FuzzyCSVTables.

There are mainly two classes that help with this which include fuzzycsv.Excel2Csv and fuzzycsv.CSVToExcel

Note:

This library has not been tested with very large(700,000 records plus) CSV files.So performance might be a concern.

More example can be seen here

https://github.com/kayr/fuzzy-csv/blob/master/src/test/groovy/fuzzycsv/FuzzyCSVTest.groovy

and

https://github.com/kayr/fuzzy-csv/blob/master/src/test/groovy/fuzzycsv/FuzzyCSVTableTest.groovy