GET FuzzyCsv
FuzzyCsv is published to both maven central and jitpack.io. Starting from 1.7.1 all versions of FuzzyCSV will be published to maven central
GET FUZZYCSV FROM MAVEN CENTRAL
For maven add the dependency to your pom file like this
Maven
<dependency>
<groupId>io.github.kayr</groupId>
<artifactId>fuzzy-csv</artifactId>
<version>${version}</version>
</dependency>
Gradle:
compile 'io.github.kayr:fuzzy-csv:${version}'
GET FUZZYCSV FROM JITPACK.IO
The to get artefacts from jipack the name space changes from io.github.kayr
to com.github.kayr
.
For Maven
...
<!-- Add this Repository -->
<repositories>
<repository>
<id>jitpack.io</id>
<url>https://jitpack.io</url>
</repository>
</repositories>
...
<dependency>
<groupId>com.github.kayr</groupId>
<artifactId>fuzzy-csv</artifactId>
<version>${version}</version>
</dependency>
...
For Gradle
repositories {
/* Add this repository */
maven { url 'https://jitpack.io' }
}
dependencies{
...
compile 'com.github.kayr:fuzzy-csv:${version}'
...
}
USAGE
DATA INITIALIZATION
Loading Data into Fuzzy-CSV
FuzzyCSVTable.fromResultSet(sqlResultSet)
FuzzyCSVTable.fromSqlQuery(groovySql, "select * from table")
FuzzyCSVTable.fromListList(listMap)
FuzzyCSVTable.fromMapList(listOfLists)
FuzzyCSVTable.fromJsonText('''[["colum"],["value1"]]''')
//parse
FuzzyCSVTable.fromCsvString(csvString)
FuzzyCSVTable.fromCsvReader(reader)
//if you wish to customise the parsing you can provide more options
FuzzyCSVTable.fromCsvString(csvString, separator/* , */, quoteChar /* " */, escapeChar /* \ */)
Visualize json data in a console grid table
You can convert json text to a grid console table.
import fuzzycsv.FuzzyCSVTable
def jsonText = '''
{
"id": "0001",
"type": "donut",
"name": "Cake",
"ppu": 0.55,
"batters":
{
"batter":
[
{ "id": "1001", "type": "Regular" },
{ "id": "1002", "type": "Chocolate","color": "Brown" }
]
},
"topping":
[
{ "id": "5001", "type": "None" },
{ "id": "5002", "type": "Glazed" },
{ "id": "5005", "type": "Sugar" ,"color": "Brown"}
]
}'''
FuzzyCSVTable.fromJsonText(jsonText)
.asListGrid()
.printTable()
Output
╔═════════╤═══════════════════════════════════════════╗ ║ key │ value ║ ╠═════════╪═══════════════════════════════════════════╣ ║ id │ 0001 ║ ╟─────────┼───────────────────────────────────────────╢ ║ type │ donut ║ ╟─────────┼───────────────────────────────────────────╢ ║ name │ Cake ║ ╟─────────┼───────────────────────────────────────────╢ ║ ppu │ 0.55 ║ ╟─────────┼───────────────────────────────────────────╢ ║ batters │ ╔════════╤══════════════════════════════╗ ║ ║ │ ║ key │ value ║ ║ ║ │ ╠════════╪══════════════════════════════╣ ║ ║ │ ║ batter │ ╔══════╤═══════════╤═══════╗ ║ ║ ║ │ ║ │ ║ id │ type │ color ║ ║ ║ ║ │ ║ │ ╠══════╪═══════════╪═══════╣ ║ ║ ║ │ ║ │ ║ 1001 │ Regular │ - ║ ║ ║ ║ │ ║ │ ╟──────┼───────────┼───────╢ ║ ║ ║ │ ║ │ ║ 1002 │ Chocolate │ Brown ║ ║ ║ ║ │ ║ │ ╚══════╧═══════════╧═══════╝ ║ ║ ║ │ ╚════════╧══════════════════════════════╝ ║ ╟─────────┼───────────────────────────────────────────╢ ║ topping │ ╔══════╤════════╤═══════╗ ║ ║ │ ║ id │ type │ color ║ ║ ║ │ ╠══════╪════════╪═══════╣ ║ ║ │ ║ 5001 │ None │ - ║ ║ ║ │ ╟──────┼────────┼───────╢ ║ ║ │ ║ 5002 │ Glazed │ - ║ ║ ║ │ ╟──────┼────────┼───────╢ ║ ║ │ ║ 5005 │ Sugar │ Brown ║ ║ ║ │ ╚══════╧════════╧═══════╝ ║ ╚═════════╧═══════════════════════════════════════════╝
RECORD FUNCTIONS
These Help you write expression or functions for a record.E.g A function multiplying price by quantity.The record function run in two modes:
-
One with type coercion which can be created using
RecordFx.fn{}
.This mode is lenient and does not throw most exceptions.This mode supports division of nulls(null/2
), zero(2/0
) division and type coercion("2"/2 or Object/null
) . This mode adds extra overhead and is much slower if your are dealing with lots of records. -
Another mode is
RecordFx.fx{}
which uses the default groovy evaluator.This mode is much faster if you are working with lots of records.However, this mode is not lenient and hence can throwjava.lang.ArithmeticException: Division by zero
.If you want to enable leniency but still want to use the fasterRecordFx.fx{}
you can wrap your code in thefuzzycsv.FxExtensions
category(e.guse(FxExtensions){ …code here.. }
) So the category is registered only once as compared to the former where the category is reqistered on each and every evaluation.
DATA MERGING
Merging with a fuzzy match
-
Set the accuracy threshold to 75%
-
Merge using code below
import fuzzycsv.FuzzyCSV
import fuzzycsv.FuzzyCSVTable
def csv1 = [
['first name', 'sex'],
['alex', 'male'],
['sara', 'female']
]
def csv2 = [
['ferts nama', 'age', 'sex'],
['alex', '21', 'male'],
['peter', '21', 'male']
]
//set accuracy threshold
FuzzyCSV.ACCURACY_THRESHOLD.set(0.75)
FuzzyCSVTable.tbl(csv1)
.mergeByColumn(csv2)
.printTable()
Output
╔════════════╤════════╤═════╗ ║ first name │ sex │ age ║ ╠════════════╪════════╪═════╣ ║ alex │ male │ - ║ ╟────────────┼────────┼─────╢ ║ sara │ female │ - ║ ╟────────────┼────────┼─────╢ ║ alex │ male │ 21 ║ ╟────────────┼────────┼─────╢ ║ peter │ male │ 21 ║ ╚════════════╧════════╧═════╝
Notice how it merged [first name] and [ferts nama]
Joins
Inner join
tbl(csv1).join(csv2, 'name')
.printTable()
Output
╔══════╤══════╤═════╤════════╗ ║ name │ sex │ age │ hobby ║ ╠══════╪══════╪═════╪════════╣ ║ alex │ male │ 21 │ biking ║ ╚══════╧══════╧═════╧════════╝
Left Join
tbl(csv1).leftJoin(csv2, 'name')
.printTable()
Output
╔══════╤════════╤═════╤════════╗ ║ name │ sex │ age │ hobby ║ ╠══════╪════════╪═════╪════════╣ ║ alex │ male │ 21 │ biking ║ ╟──────┼────────┼─────┼────────╢ ║ sara │ female │ - │ - ║ ╚══════╧════════╧═════╧════════╝
Right Join
tbl(csv1).rightJoin(csv2, 'name')
.printTable()
Output
╔═══════╤══════╤═════╤══════════╗ ║ name │ sex │ age │ hobby ║ ╠═══════╪══════╪═════╪══════════╣ ║ alex │ male │ 21 │ biking ║ ╟───────┼──────┼─────┼──────────╢ ║ peter │ - │ 21 │ swimming ║ ╚═══════╧══════╧═════╧══════════╝
Full Join
tbl(csv1).fullJoin(csv2, 'name')
.printTable()
Output
╔═══════╤════════╤═════╤══════════╗ ║ name │ sex │ age │ hobby ║ ╠═══════╪════════╪═════╪══════════╣ ║ alex │ male │ 21 │ biking ║ ╟───────┼────────┼─────┼──────────╢ ║ sara │ female │ - │ - ║ ╟───────┼────────┼─────┼──────────╢ ║ peter │ - │ 21 │ swimming ║ ╚═══════╧════════╧═════╧══════════╝
Join With Custom Function
tbl(csv1).fullJoin(tbl(csv2)) { it.left('name') == it.right('name') }
.printTable()
Output
╔══════╤════════╤═══════╤═════╤══════════╗ ║ name │ sex │ name │ age │ hobby ║ ╠══════╪════════╪═══════╪═════╪══════════╣ ║ alex │ male │ alex │ 21 │ biking ║ ╟──────┼────────┼───────┼─────┼──────────╢ ║ sara │ female │ - │ - │ - ║ ╟──────┼────────┼───────┼─────┼──────────╢ ║ - │ - │ peter │ 21 │ swimming ║ ╚══════╧════════╧═══════╧═════╧══════════╝
DATA QUERYING
Iterating over records
You can use any of the groovy methods to iterate of a FuzzyCSVTable
since it implements Iterable.
import static fuzzycsv.FuzzyStaticApi.*
def csv = [
['name', 'age', 'hobby'],
['alex', '21', 'biking'],
['peter', '21', 'swimming']
]
assert tbl(csv).collect { it.name } == ['alex', 'peter']
assert tbl(csv).inject('') { acc, record -> acc + record.name + ',' } == 'alex,peter,'
tbl(csv).each { println(it.name) }
Output
alex peter
Get Cell Value or First Value
assert tbl(csv)['name'][0] == 'alex'
assert tbl(csv)['name'][1] == 'peter'
assert tbl(csv).firstCell() == 'alex'
Filter Records
def csv = [
['name', 'age', 'hobby'],
['alex', 21, 'biking'],
['peter', 13, 'swimming']]
tbl(csv).filter { it.name == 'alex' }
.printTable()
Output
╔══════╤═════╤════════╗ ║ name │ age │ hobby ║ ╠══════╪═════╪════════╣ ║ alex │ 21 │ biking ║ ╚══════╧═════╧════════╝
Sorting
def csv2 = [
['name', 'age', 'hobby'],
['alex', '21', 'biking'],
['martin', '40', 'swimming'],
['dan', '25', 'swimming'],
['peter', '21', 'swimming'],
]
def sorted1 = tbl(csv2).sort('age', 'name').printTable()
//or sort using closure
def sorted2 = tbl(csv2).sort { "$it.age $it.name" }.printTable()
Output
---- SORTED WITH COLUMN NAMES ╔════════╤═════╤══════════╗ ║ name │ age │ hobby ║ ╠════════╪═════╪══════════╣ ║ alex │ 21 │ biking ║ ╟────────┼─────┼──────────╢ ║ peter │ 21 │ swimming ║ ╟────────┼─────┼──────────╢ ║ dan │ 25 │ swimming ║ ╟────────┼─────┼──────────╢ ║ martin │ 40 │ swimming ║ ╚════════╧═════╧══════════╝ ---- SORTED WITH CLOSURE ╔════════╤═════╤══════════╗ ║ name │ age │ hobby ║ ╠════════╪═════╪══════════╣ ║ alex │ 21 │ biking ║ ╟────────┼─────┼──────────╢ ║ peter │ 21 │ swimming ║ ╟────────┼─────┼──────────╢ ║ dan │ 25 │ swimming ║ ╟────────┼─────┼──────────╢ ║ martin │ 40 │ swimming ║ ╚════════╧═════╧══════════╝
Ranges
Ranges help slice the table data. e.g. selecting last 2, top 2, 3rd to 2nd last record
def csv = [
['name', 'age', 'hobby'],
['alex', '21', 'biking'],
['martin', '40', 'swimming'],
['dan', '25', 'swimming'],
['peter', '21', 'swimming'],
]
//top 2
def firstTwo = tbl(csv)[1..2].printTable()
//last 2
def lastTwo = tbl(csv)[-1..-2].printTable()
//2nd and 3rd
def middle = tbl(csv)[2..3].printTable()
Output
---- TOP 2 ╔════════╤═════╤══════════╗ ║ name │ age │ hobby ║ ╠════════╪═════╪══════════╣ ║ alex │ 21 │ biking ║ ╟────────┼─────┼──────────╢ ║ martin │ 40 │ swimming ║ ╚════════╧═════╧══════════╝ ---- LAST 2 ╔═══════╤═════╤══════════╗ ║ name │ age │ hobby ║ ╠═══════╪═════╪══════════╣ ║ peter │ 21 │ swimming ║ ╟───────┼─────┼──────────╢ ║ dan │ 25 │ swimming ║ ╚═══════╧═════╧══════════╝ ---- MIDDLE 2 ╔════════╤═════╤══════════╗ ║ name │ age │ hobby ║ ╠════════╪═════╪══════════╣ ║ martin │ 40 │ swimming ║ ╟────────┼─────┼──────────╢ ║ dan │ 25 │ swimming ║ ╚════════╧═════╧══════════╝
Record Navigation
Example showing running sum
@Grab('io.github.kayr:fuzzy-csv:1.9.1-groovy4')
import static fuzzycsv.FuzzyStaticApi.*
def csv = [["name", "age"],
["kay", 1],
["sa", 22],
["kay2", 1],
["ben", 10]]
//add running sum of age
tbl(csv).addColumn(fx("running_sum") { (it.up()?.running_sum ?: 0) + it.age })
.printTable()
Output
╔══════╤═════╤═════════════╗ ║ name │ age │ running_sum ║ ╠══════╪═════╪═════════════╣ ║ kay │ 1 │ 1 ║ ╟──────┼─────┼─────────────╢ ║ sa │ 22 │ 23 ║ ╟──────┼─────┼─────────────╢ ║ kay2 │ 1 │ 24 ║ ╟──────┼─────┼─────────────╢ ║ ben │ 10 │ 34 ║ ╚══════╧═════╧═════════════╝
Or sum bottom value with current value
tbl(csv).addColumn(fx("bottom_up") { (it.down().age ?: 0) + it.age }).printTable()
Cell Navigation
Navigators help move through the table cells easily.You can look above,below, right or left of a cell.
import fuzzycsv.nav.Navigator
import static fuzzycsv.FuzzyStaticApi.tbl
def csv = [
['name', 'age', 'hobby', 'category'],
['alex', '21', 'biking', 'A'],
['peter', '21', 'swimming', 'S'],
['charles', '21', 'swimming', 'S'],
['barbara', '23', 'swimming', 'S']
]
def navigator = new Navigator(0, 0, tbl(csv))
assert navigator.down().down().value() == 'peter'
assert navigator.right().value() == 'age'
assert navigator.right().left().value() == 'name'
assert navigator.down().up().value() == 'name'
// Move down
assert navigator.downIter().collect { it.value() } == ['name', 'alex', 'peter', 'charles', 'barbara']
// MoveRight
assert navigator.rightIter().collect { it.value() } == ['name', 'age', 'hobby', 'category']
//move through all
assert navigator.allIter().collect { it.value() } == ['name', 'age', 'hobby', 'category', 'alex', '21', 'biking', 'A', 'peter', '21', 'swimming',
'S', 'charles', '21', 'swimming', 'S', 'barbara', '23', 'swimming', 'S']
//move through all bounded
assert navigator.allBoundedIter(1, 2).collect { it.value() } == ['name', 'age', 'alex', '21', 'peter', '21']
//move up
assert new Navigator(0, 4, tbl(csv)).upIter().collect { it.value() } == ['barbara', 'charles', 'peter', 'alex', 'name']
Select function
To shoe the difference of how Record.fx
and Record.fn
observe the output of the following code snippet
def csv2 = [
['price', 'quantity'],
['2', 5],
['3', 4]
]
//using the fn functions demonstrates how you can do arithmetic on strings and will get a valid result
def r1 = tbl(csv2)
.select('price', 'quantity', fn('total') { it.price * it.quantity })
.printTable()
// here we will use the fx function notice the output will be a repeated string.
def r2 = tbl(csv2)
.select('price', 'quantity', fx('total') { it.price * it.quantity })
.printTable()
Output
--- USING FN FUNCTION ╔═══════╤══════════╤═══════╗ ║ price │ quantity │ total ║ ╠═══════╪══════════╪═══════╣ ║ 2 │ 5 │ 10 ║ ╟───────┼──────────┼───────╢ ║ 3 │ 4 │ 12 ║ ╚═══════╧══════════╧═══════╝ --- USING FX FUNTION ╔═══════╤══════════╤═══════╗ ║ price │ quantity │ total ║ ╠═══════╪══════════╪═══════╣ ║ 2 │ 5 │ 22222 ║ ╟───────┼──────────┼───────╢ ║ 3 │ 4 │ 3333 ║ ╚═══════╧══════════╧═══════╝
DATA MANIPULATION FUNCTIONS
Delete Records
def csv = [
['name', 'age', 'hobby', 'category'],
['alex', '21', 'biking', 'A'],
['peter', '21', 'swimming', 'S'],
['charles', '21', 'swimming', 'S'],
['barbara', '23', 'swimming', 'S']]
tbl(csv).delete { it.age == '21' }
.printTable()
Output
╔═════════╤═════╤══════════╤══════════╗ ║ name │ age │ hobby │ category ║ ╠═════════╪═════╪══════════╪══════════╣ ║ barbara │ 23 │ swimming │ S ║ ╚═════════╧═════╧══════════╧══════════╝
Deduplicate by column
def csv = [
['name', 'age', 'hobby', 'category'],
['alex', '21', 'biking', 'A'],
['peter', '21', 'swimming', 'S'],
['charles', '21', 'swimming', 'S'],
['barbara', '23', 'swimming', 'S']]
tbl(csv).distinctBy('age', 'category')
.printTable()
Output
╔═════════╤═════╤══════════╤══════════╗ ║ name │ age │ hobby │ category ║ ╠═════════╪═════╪══════════╪══════════╣ ║ alex │ 21 │ biking │ A ║ ╟─────────┼─────┼──────────┼──────────╢ ║ peter │ 21 │ swimming │ S ║ ╟─────────┼─────┼──────────┼──────────╢ ║ barbara │ 23 │ swimming │ S ║ ╚═════════╧═════╧══════════╧══════════╝
Adding Records
def csv = tbl([
['name', 'number'],
['join', 1.1]])
csv.addRecordArr("JB", 455)
.addRecord(["JLis", 767])
.addRecordMap([name: "MName", number: 90])
.addRecordArr()
.addRecordMap([name: "MNameEmp"])
.printTable()
Output
╔══════════╤════════╗ ║ name │ number ║ ╠══════════╪════════╣ ║ join │ 1.1 ║ ╟──────────┼────────╢ ║ JB │ 455 ║ ╟──────────┼────────╢ ║ JLis │ 767 ║ ╟──────────┼────────╢ ║ MName │ 90 ║ ╟──────────┼────────╢ ║ - │ - ║ ╟──────────┼────────╢ ║ MNameEmp │ - ║ ╚══════════╧════════╝
Update values with where clause
def csv2 = [
['name', 'age', 'hobby'],
['alex', '21', 'biking'],
['martin', '40', 'swimming'],
['dan', '25', 'swimming'],
['peter', '21', 'swimming'],
]
tbl(csv2).modify {
set {
it.hobby = "running"
it.age = '900'
}
where {
it.name in ['dan', 'alex']
}
}.printTable()
Output
╔════════╤═════╤══════════╗ ║ name │ age │ hobby ║ ╠════════╪═════╪══════════╣ ║ alex │ 900 │ running ║ ╟────────┼─────┼──────────╢ ║ martin │ 40 │ swimming ║ ╟────────┼─────┼──────────╢ ║ dan │ 900 │ running ║ ╟────────┼─────┼──────────╢ ║ peter │ 21 │ swimming ║ ╚════════╧═════╧══════════╝
Transform each cell record
def table = tbl([
['name', 'age', 'hobby'],
['alex', '21', 'biking'],
['martin', '40', 'swimming'],
['dan', '25', 'swimming'],
['peter', '21', 'swimming'],
])
table.transform { it.padRight(10, '-') }.printTable()
Output
╔════════════╤════════════╤════════════╗ ║ name │ age │ hobby ║ ╠════════════╪════════════╪════════════╣ ║ alex------ │ 21-------- │ biking---- ║ ╟────────────┼────────────┼────────────╢ ║ martin---- │ 40-------- │ swimming-- ║ ╟────────────┼────────────┼────────────╢ ║ dan------- │ 25-------- │ swimming-- ║ ╟────────────┼────────────┼────────────╢ ║ peter----- │ 21-------- │ swimming-- ║ ╚════════════╧════════════╧════════════╝
DATA RESHAPING FUNCTIONS (COLUMN MANIPULATION)
Delete Column
tbl(csv2).delete('name', 'age').printTable()
Output
╔══════════╗ ║ hobby ║ ╠══════════╣ ║ biking ║ ╟──────────╢ ║ swimming ║ ╚══════════╝
Add Column
tbl(csv).addColumn(fx('Double Age') { it.age * 2 })
.printTable()
Output
╔═══════╤═════╤══════════╤════════════╗ ║ name │ age │ hobby │ Double Age ║ ╠═══════╪═════╪══════════╪════════════╣ ║ alex │ 21 │ biking │ 42 ║ ╟───────┼─────┼──────────┼────────────╢ ║ peter │ 13 │ swimming │ 26 ║ ╚═══════╧═════╧══════════╧════════════╝
Transposing
tbl(csv2).transpose()
.printTable()
Output
╔═══════╤════════╤══════════╤══════════╤══════════╗ ║ name │ alex │ martin │ dan │ peter ║ ╠═══════╪════════╪══════════╪══════════╪══════════╣ ║ age │ 21 │ 40 │ 25 │ 21 ║ ╟───────┼────────┼──────────┼──────────┼──────────╢ ║ hobby │ biking │ swimming │ swimming │ swimming ║ ╚═══════╧════════╧══════════╧══════════╧══════════╝
Unwinding a column
This is kind can be used to unwind a column which has nested lists or collections
import static fuzzycsv.FuzzyStaticApi.*
def csv = [
['name', 'AgeList' ],
['biking', [21,16] ],
['swimming', [21,15] ]
]
tbl(csv).unwind('AgeList')
.printTable()
Output
╔══════════╤═════════╗ ║ name │ AgeList ║ ╠══════════╪═════════╣ ║ biking │ 21 ║ ╟──────────┼─────────╢ ║ biking │ 16 ║ ╟──────────┼─────────╢ ║ swimming │ 21 ║ ╟──────────┼─────────╢ ║ swimming │ 15 ║ ╚══════════╧═════════╝
Spreading a column
Expand outwards a column which contains list items
import static fuzzycsv.FuzzyStaticApi.*
def csv = [
['name', 'AgeList' ],
['biking', [21,16] ],
['swimming', [21,15] ]
]
tbl(csv).spread('AgeList')
.printTable()
Output
╔══════════╤═══════════╤═══════════╗ ║ name │ AgeList_1 │ AgeList_2 ║ ╠══════════╪═══════════╪═══════════╣ ║ biking │ 21 │ 16 ║ ╟──────────┼───────────┼───────────╢ ║ swimming │ 21 │ 15 ║ ╚══════════╧═══════════╧═══════════╝
Spread out a column with maps
def csv = [
['name', 'Age'],
['biking', [age: 21, height: 16]],
['swimming', [age: 21, height: 15]]
]
tbl(csv).spread('Age')
.printTable()
Output
╔══════════╤═════════╤════════════╗ ║ name │ Age_age │ Age_height ║ ╠══════════╪═════════╪════════════╣ ║ biking │ 21 │ 16 ║ ╟──────────┼─────────┼────────────╢ ║ swimming │ 21 │ 15 ║ ╚══════════╧═════════╧════════════╝
Spread with custom column names
import static fuzzycsv.FuzzyStaticApi.*
def csv = [
['name', 'Age'],
['biking', [age: 21, height: 16]],
['swimming', [age: 21, height: 15]]
]
tbl(csv).spread(spreader("Age") { col, key -> "MyColName: $key" })
.printTable()
Output
╔══════════╤════════════════╤═══════════════════╗ ║ name │ MyColName: age │ MyColName: height ║ ╠══════════╪════════════════╪═══════════════════╣ ║ biking │ 21 │ 16 ║ ╟──────────┼────────────────┼───────────────────╢ ║ swimming │ 21 │ 15 ║ ╚══════════╧════════════════╧═══════════════════╝
Move column
import static fuzzycsv.FuzzyStaticApi.tbl
def csv = [
['name', 'age', 'hobby', 'category'],
['alex', '21', 'biking', 'A'],
['peter', '21', 'swimming', 'S'],
['charles', '21', 'swimming', 'S'],
['barbara', '23', 'swimming', 'S']
]
tbl(csv).moveCol("age", "category")
.printTable()
Output
╔═════════╤══════════╤══════════╤═════╗ ║ name │ hobby │ category │ age ║ ╠═════════╪══════════╪══════════╪═════╣ ║ alex │ biking │ A │ 21 ║ ╟─────────┼──────────┼──────────┼─────╢ ║ peter │ swimming │ S │ 21 ║ ╟─────────┼──────────┼──────────┼─────╢ ║ charles │ swimming │ S │ 21 ║ ╟─────────┼──────────┼──────────┼─────╢ ║ barbara │ swimming │ S │ 23 ║ ╚═════════╧══════════╧══════════╧═════╝
AGGREGATION FUNCTIONS
Simplistic Aggregations
In the example below we find the average age in each hobby by making use of sum count and group by functions
import static fuzzycsv.FuzzyStaticApi.*
def csv2 = [
['name', 'age', 'Hobby'],
['alex', '21', 'biking'],
['peter', '21', 'swimming'],
['davie', '15', 'swimming'],
['sam', '16', 'biking'],
]
tbl(csv2).summarize(
'Hobby',
sum('age').az('TT.Age'),
count('name').az('TT.Count')
).printTable()
Output
╔══════════╤════════╤══════════╗ ║ Hobby │ TT.Age │ TT.Count ║ ╠══════════╪════════╪══════════╣ ║ biking │ 37 │ 2 ║ ╟──────────┼────────┼──────────╢ ║ swimming │ 36 │ 2 ║ ╚══════════╧════════╧══════════╝
Custom Aggregation
tbl(csv2).summarize(
'Hobby',
reduce { group -> group['age'] }.az('AgeList')
).printTable()
Output
╔══════════╤══════════╗ ║ Hobby │ AgeList ║ ╠══════════╪══════════╣ ║ biking │ [21, 16] ║ ╟──────────┼──────────╢ ║ swimming │ [21, 15] ║ ╚══════════╧══════════╝
EXPORT FUNCTIONS
CSV To MapList
tbl(csv).toMapList()
Output
[[name:alex, age:21, hobby:biking], [name:peter, age:21, hobby:swimming]]
CSV To POJO List
@Grab('io.github.kayr:fuzzy-csv:1.9.1-groovy4')
import static fuzzycsv.FuzzyCSVTable.tbl
def csv = [
['name', 'age', 'hobby'],
['alex', '21', 'biking'],
['peter', '21', 'swimming']]
class Person {
String name
Integer age
String hobby
}
List<Person> people = tbl(csv).toPojoList(Person.class)
assert people.size() == 3
assert people.first().name == 'alex'
assert people.first().age == 21
Sql To CSV
groovy.sql.Sql groovySql = null//...
FuzzyCSVTable.toCSV(groovySql,"select * from PERSON")
OR
ResultSet resultSet = null//..
FuzzyCSVTable.toCSV(resultSet)
EXCEL UTILITY CLASSES
To use the Excel utilities you have to add the poi dependency to your classpath
If you are using gradle add this.
compile 'org.apache.poi:poi-ooxml:3.16', {
exclude group: 'stax', module: 'stax-api'
}
compile 'org.apache.poi:ooxml-schemas:1.3', {
exclude group: 'stax', module: 'stax-api'
}
After this you can use the Excel utilities to convert excel sheets to and from FuzzyCSVTables.
There are mainly two classes that help with this which include fuzzycsv.Excel2Csv
and fuzzycsv.CSVToExcel
Note:
This library has not been tested with very large(700,000 records plus) CSV files.So performance might be a concern.
More example can be seen here
and