
Data Cleansing / Integration

http://www.itbusinessedge.com/blogs/integration/five-new-tools-for-smarter-big-data-integration.html


The article linked above lists five technologies available for smarter data preparation work:

  • Data Tamer, which focuses on integration and is still being developed at MIT.
  • OpenRefine, formerly Google Refine, which helps with data clean-up.
  • Data Wrangler, a cleaning and transformation tool developed at Stanford.
  • reshape2, an R package that lets you restructure and aggregate data.
  • plyr, an R package that implements the split-apply-combine strategy.
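
To make the last two items concrete, here is a minimal sketch in plain Python of the two ideas they name: reshaping wide records into long form, then split-apply-combine over the result. It is not the reshape2 or plyr API; the function names `melt` and `split_apply_combine`, the column names, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def melt(rows, id_keys):
    """Reshape wide records into long records of (ids..., variable, value)."""
    long_rows = []
    for row in rows:
        ids = {k: row[k] for k in id_keys}
        for key, value in row.items():
            if key not in id_keys:
                long_rows.append({**ids, "variable": key, "value": value})
    return long_rows


def split_apply_combine(rows, key, apply):
    """Split rows by `key`, apply a function to each group's values,
    and combine the per-group results into one dict."""
    groups = defaultdict(list)
    for row in rows:                      # split
        groups[row[key]].append(row["value"])
    return {group: apply(values)          # apply + combine
            for group, values in groups.items()}


# Hypothetical sample data: one row per city, one column per month.
wide = [
    {"city": "Oslo", "temp_jan": -4.0, "temp_jul": 18.0},
    {"city": "Rome", "temp_jan": 8.0, "temp_jul": 26.0},
]

long_rows = melt(wide, id_keys=["city"])
mean_by_city = split_apply_combine(long_rows, key="city", apply=mean)
print(mean_by_city)
```

Keeping the reshape step separate from the aggregation step mirrors how the two R packages are typically used together: first bring the data into a long layout, then group and summarize it.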


http://www.cloveretl.com/ - CloverETL® is a data integration platform that scales from an open source desktop tool to a commercial cloud cluster.
It's a Java-based open platform that helps design, automate, and monitor data integration processes.
