
Data Cleansing / Integration


He lists five available technologies for smarter data preparation work:

  • Data Tamer, which focuses on integration and is still being developed at MIT.
  • OpenRefine, formerly Google Refine, which helps with data clean-up.
  • Data Wrangler, a cleaning and transformation tool developed at Stanford.
  • reshape2, an R package that lets you restructure and aggregate data.
  • plyr, an R package built around the split-apply-combine strategy.
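
The split-apply-combine strategy mentioned in the last item can be sketched in a few lines. This is a minimal illustration in plain Python, not plyr's actual API: the function name `split_apply_combine`, its parameters, and the sample rows are all hypothetical and chosen only for this example.

```python
from collections import defaultdict
from statistics import mean


def split_apply_combine(rows, key, value, func):
    """Group rows by `key`, apply `func` to each group's values, return a dict.

    Hypothetical helper for illustration only; it is not part of plyr.
    """
    groups = defaultdict(list)
    # Split: partition the rows by the grouping key.
    for row in rows:
        groups[row[key]].append(row[value])
    # Apply and combine: summarize each group and collect the results.
    return {k: func(v) for k, v in groups.items()}


# Made-up sample data.
rows = [
    {"city": "Boston", "temp": 10},
    {"city": "Boston", "temp": 14},
    {"city": "Austin", "temp": 25},
]

result = split_apply_combine(rows, "city", "temp", mean)
print(result)
```

Each group is summarized independently of the others, which is why the same pattern carries over to parallel and distributed settings.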

http://www.cloveretl.com/ - CloverETL® is a data integration platform that scales from an open source desktop tool to a commercial cloud cluster.
It is a Java-based open platform for designing, automating, and monitoring data integration processes.