Waffles 20090927
Update: January 17 2012
Description
The Waffles project offers a collection of command-line tools for researchers in machine learning, data mining, and related fields. All of the functionality is also provided in a clean C++ class library. Demo apps are included to show how to use the class library.

Some Quick Examples of using Waffles:

The Waffles tools mostly work with .arff files. Let's say you have a database or a spreadsheet that can export to comma-separated values, and you want to convert it to a .arff file.

Command:
waffles_transform import mydata.csv > mydata.arff

You might want to see some quick stats about a dataset.

Command:
waffles_plot stats iris.arff

Another quick way to look at a dataset is to look at a matrix of pairwise plots and look for correlated attributes. (Obviously every attribute is correlated with itself, so histograms of the attributes are shown along the diagonal.)

Command:
waffles_plot overview diabetes.arff

Maybe you'll need to tweak your dataset. You can swap columns, fill in missing values, sort by a particular column, shuffle rows, and apply numerous other useful transformations.

Command:
waffles_transform swapcolumns mydata.arff 0 3
waffles_transform replacemissingvalues mydata.arff
waffles_transform sortcolumn mydata.arff 2
waffles_transform shuffle mydata.arff

Let's do some basic machine learning. We'll use 50x2 cross-validation to test the predictive accuracy of various models on the iris dataset. We'll use baseline, a decision tree, an ensemble of 30 decision trees, a 3-NN instance learner, a 5-NN instance learner, naive Bayes, a perceptron, and a neural network with one hidden layer of 4 nodes.
(Many other models are available, but are not demonstrated here.)

Command:
waffles_learn crossvalidate -reps 50 -folds 2 iris.arff baseline
waffles_learn crossvalidate -reps 50 -folds 2 iris.arff decisiontree
waffles_learn crossvalidate -reps 50 -folds 2 iris.arff bag 30 decisiontree end
waffles_learn crossvalidate -reps 50 -folds 2 iris.arff knn 3
waffles_learn crossvalidate -reps 50 -folds 2 iris.arff knn 5
waffles_learn crossvalidate -reps 50 -folds 2 iris.arff discretize naivebayes
waffles_learn crossvalidate -reps 50 -folds 2 iris.arff orthogonalize neuralnet
waffles_learn crossvalidate -reps 50 -folds 2 iris.arff orthogonalize neuralnet -addlayer 4

In this example, we will train a neural network with two hidden layers (each with 4 nodes). We will wrap the neural network in an orthogonalization filter so it can handle nominal data as well as continuous data. We will save the trained model to a file (model.twt). Then, we'll load that model from the file and use it to evaluate labels for all the patterns in a test set.

Command:
waffles_learn train train.arff orthogonalize neuralnet -addlayer 4 -addlayer 4 > model.twt
waffles_learn evaluate model.twt test.arff > predictions.arff

You can use Waffles to plot equations. This is a simple 2D plot of the logistic sigmoid function. (By default, an image named plot.png is generated. You can view it with your favorite image viewer.)

Command:
waffles_plot equation -range -6 0 6 1 "f1(x) = 1/(1+e^(-x))"

Let's plot multiple equations together. Notice that I define a helper function, g(x). Of course you can use common operations like: abs, acos, acosh, asin, asinh, atan, atanh, ceil, cos, cosh, erf, floor, gamma, lgamma, log, max, min, sin, sinh, sqrt, tan, and tanh.
You can also overload those operations, define constants, etc.

Command:
waffles_plot equation -range -10 0 10 10 "f1(x)=log(x^2+1)+2;f2(x)= x^2/g(x)+2;g(m)=10*(cos(m)+pi);f3(x)=sqrt(49-x^2);f4(x)=abs(x)-1"

Suppose you want to make a precision-recall graph for an ensemble of 100 random decision trees with the diabetes dataset. Here's how you could do this. (The horizontal axis shows the recall, and the vertical axis shows the corresponding precision for each of the labels. Blue shows the precision when trying to identify the cases most likely to test negative for diabetes. Red shows the precision when trying to identify the cases most likely to test positive for diabetes. Apparently, Random Forest finds the former task to be easier.)

Command:
waffles_learn precisionrecall diabetes.arff bag 100 decisiontree -random end > pr.arff
waffles_plot scatter pr.arff -lines

Let's generate 2000 points that lie on a swiss roll manifold. Since 3D data can be hard to visualize, we'll plot it from several different points of view.

Command:
waffles_generate swissroll 2000 -cutoutstar -seed 0 > sr.arff
waffles_plot 3d sr.arff -blast -pointradius 300

Now, let's generate and plot a collection of 2000 points that lie on a self-intersecting ribbon manifold.

Command:
waffles_generate selfintersectingribbon 2000 -seed 1 > in.arff
waffles_plot 3d in.arff

Next, we'll use Manifold Sculpting to learn that self-intersecting ribbon manifold. (We'll use 12 neighbors, 2 target dims, an intelligent neighbor-finding algorithm, a shortcut-pruning algorithm, and a slow scaling rate.)

Command:
waffles_transform manifoldsculpting in.arff 12 2 -smartneighbors -pruneshortcuts -scalerate 0.9995 -seed 0 > out.arff
waffles_plot scatter out.arff -spectrum -pointradius 5 -nohorizaxislabels -novertaxislabels

We'll draw 1 million random values from a gamma distribution (alpha=9, beta=2) and then plot a histogram of those values.
Other supported distributions include: beta, binomial, cauchy, chisquare, exponential, f, gamma, gaussian, geometric, logistic, lognormal, normal, poisson, softimpulse, spherical, student, uniform, weibull.

Command:
waffles_generate noise 1000000 -seed 0 -dist gamma 9 2 > gamma.arff
waffles_plot histogram gamma.arff

Waffles supports many useful filters. For example, if you have some high-dimensional data, but your algorithm works better with low-dimensional data, filter it through "pca". If you have real-valued data, but your algorithm only supports discrete values, filter it through "discretize". If your algorithm only supports real values, but you have nominal data, filter it with "orthogonalize". If your data is not within the ideal range, filter with "normalize". These filters work in both directions, and you can specify whether they apply to features, labels, or both.

Command:
waffles_learn crossvalidate data.arff pca 7 knn 5
waffles_learn crossvalidate data.arff discretize naivebayes
waffles_learn crossvalidate data.arff orthogonalize meanmargins
waffles_learn crossvalidate data.arff normalize -range 0 1 somealgorithm

If you don't know which algorithm to use, but you've got cycles to burn, cross-validation-selecting ensembles are a powerful choice. For really strong results, you can even make a cvselect ensemble of bagging ensembles.

Command:
waffles_learn splittest -trainratio 0.3 cvselect knn 5 orthogonalize neuralnet decisiontree discretize naivebayes end
waffles_learn splittest -trainratio 0.2 bag 50 cvselect decisiontree meanmarginstree end end

Some algorithms have no internal model. You cannot train such algorithms, but you can still measure their predictive accuracy.

Command:
waffles_learn transacc train.arff test.arff agglomerativetransducer
waffles_learn transacc train.arff test.arff graphcuttransducer 5
waffles_learn transacc train.arff test.arff neighbortransducer 5

Matrix operations are also supported.
For example, let's compute C = AᵀB† (the transpose of A multiplied by the pseudoinverse of B).

Command:
waffles_transform transpose a.arff > a_trans.arff
waffles_transform pseudoinverse b.arff > b_inv.arff
waffles_transform multiply a_trans.arff b_inv.arff > c.arff

...and lots more.

The Waffles class library also has a lot of functionality that is not yet available through the command-line tools. Here is an incomplete list of some of the things it can do:

* Agent Algorithms
* Arff Tools
* Bagging
* Calibration
* Chess
* Clustering
* Cross-Validation Selection
* Data Augmentation
* Data Mining Tools
* Decision Trees
* Demos
* Evolutionary Optimizer
* Fourier Transform
* Gaussian Mixture Model
* Graph Cut
* GUI Tools
* Hidden Markov Models
* Hill Climbers
* Hierarchical Region Adjacency Graphs
* Image Processing Tools
* kd-Tree
* k-Means
* k-NN Instance Learner
* Linear Regression
* Manifold Learning
* Multivariate Polynomials
* MCMC for belief networks
* Naive Bayes
* Neural Network
* Particle Swarm
* Plotting
* Precision/Recall
* Principal Component Analysis
* Q-Learning
* Ray Tracer
* Self Organizing Map
* Significance Testing
* Socket Wrappers
* Stemmer

...and more (see the documentation).

What's New in This Release:

· Added Locally Linear Embedding (LLE) to the transform tool and improved the Breadth First Unfolding manifold learning algorithm.
· Added the Kabsch algorithm for aligning data.
· Added singular value decomposition to the transform tool.
· Improved API docs.
· Further simplified the learning interface.
· Repaired some regressions with serialization.
· Added several unit tests.
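For readers unfamiliar with the "50x2 cross-validation" used in the learning examples in the description, here is a minimal Python sketch of the idea: the data is shuffled and split into 2 folds, each fold is used once as the test set, and the whole procedure is repeated 50 times. This is only an illustration and not how Waffles implements it. The function names, the toy label list (three balanced classes of 50 rows), and the majority-label "baseline" learner are assumptions made for this example.

```python
import random
from collections import Counter


def majority_label(train_labels):
    """Hypothetical baseline learner: always predict the most common training label."""
    return Counter(train_labels).most_common(1)[0][0]


def repeated_cross_validation(labels, reps=50, folds=2, seed=0):
    """Return one accuracy score per (repetition, fold) pair for the baseline learner."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    scores = []
    for _ in range(reps):
        rng.shuffle(indices)  # a fresh random partition for each repetition
        for f in range(folds):
            test_idx = indices[f::folds]  # every folds-th row forms the test fold
            held_out = set(test_idx)
            train_labels = [labels[i] for i in indices if i not in held_out]
            prediction = majority_label(train_labels)
            correct = sum(1 for i in test_idx if labels[i] == prediction)
            scores.append(correct / len(test_idx))
    return scores


# Toy stand-in for a dataset's label column (an assumption, not real data).
labels = ["a"] * 50 + ["b"] * 50 + ["c"] * 50
scores = repeated_cross_validation(labels, reps=50, folds=2, seed=0)
mean_accuracy = sum(scores) / len(scores)
print(f"{len(scores)} folds evaluated, mean baseline accuracy {mean_accuracy:.3f}")
```

With balanced classes the baseline can never beat one third accuracy on a held-out fold, which is why it serves as a floor against which the other models are compared.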