diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000000000000000000000000000000000000..706fe7b361c14d5b295a001a2cb2679f0da2b908
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2023 
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/README.md b/README.md
index 1c916891bd4d830d1a931180bd2e4e8102912b76..48503bba4f72dae0133056ce1ecee0dda9b44012 100644
--- a/README.md
+++ b/README.md
@@ -1,92 +1,36 @@
-# Classification with dictionary learning and a distance barrier promoting incoherence
+# Classification with dictionary learning and a distance barrier promoting incoherence (class-idb)
 
+We present a new approach to the incoherent dictionary learning problem using a barrier function that promotes incoherence.
+This function has a context-dependent quadratic term and a distance barrier term which can be used in both local and global structures.
+This strategy achieves better results in terms of error representation and incoherence of the dictionary, compared with the standard problem.
+We demonstrate on several datasets that this function can improve the performance of dictionaries in classification problems.
 
+## Citing Us
 
-## Getting started
-
-To make it easy for you to get started with GitLab, here's a list of recommended next steps.
-
-Already a pro? Just edit this README.md and make it your own. Want to make it easy? [Use the template at the bottom](#editing-this-readme)!
-
-## Add your files
-
-- [ ] [Create](https://docs.gitlab.com/ee/user/project/repository/web_editor.html#create-a-file) or [upload](https://docs.gitlab.com/ee/user/project/repository/web_editor.html#upload-a-file) files
-- [ ] [Add files using the command line](https://docs.gitlab.com/ee/gitlab-basics/add-file.html#add-a-file-using-the-command-line) or push an existing Git repository with the following command:
+This represents the main resources necessary for reproducing the results presented in âClassification with dictionary learning and a distance barrier promoting incoherenceâ, by Denis C. Ilie Ablachim, and Bogdan Dumitrescu. If you use our work please cite the following paper
 
 ```
-cd existing_repo
-git remote add origin https://gitlab.cs.pub.ro/asydil/classification-with-dictionary-learning-and-a-distance-barrier-promoting-incoherence.git
-git branch -M main
-git push -uf origin main
+@inproceedings{class-idb,
+  title={Classification with dictionary learning and a distance barrier promoting incoherence},
+  author={Ilie-Ablachim, Denis C and Dumitrescu, Bogdan},
+  booktitle={33rd IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2023)},
+  year={2023},
+  organization={IEEE}
+}
 ```
 
-## Integrate with your tools
-
-- [ ] [Set up project integrations](https://gitlab.cs.pub.ro/asydil/classification-with-dictionary-learning-and-a-distance-barrier-promoting-incoherence/-/settings/integrations)
-
-## Collaborate with your team
-
-- [ ] [Invite team members and collaborators](https://docs.gitlab.com/ee/user/project/members/)
-- [ ] [Create a new merge request](https://docs.gitlab.com/ee/user/project/merge_requests/creating_merge_requests.html)
-- [ ] [Automatically close issues from merge requests](https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#closing-issues-automatically)
-- [ ] [Enable merge request approvals](https://docs.gitlab.com/ee/user/project/merge_requests/approvals/)
-- [ ] [Automatically merge when pipeline succeeds](https://docs.gitlab.com/ee/user/project/merge_requests/merge_when_pipeline_succeeds.html)
-
-## Test and Deploy
-
-Use the built-in continuous integration in GitLab.
-
-- [ ] [Get started with GitLab CI/CD](https://docs.gitlab.com/ee/ci/quick_start/index.html)
-- [ ] [Analyze your code for known vulnerabilities with Static Application Security Testing(SAST)](https://docs.gitlab.com/ee/user/application_security/sast/)
-- [ ] [Deploy to Kubernetes, Amazon EC2, or Amazon ECS using Auto Deploy](https://docs.gitlab.com/ee/topics/autodevops/requirements.html)
-- [ ] [Use pull-based deployments for improved Kubernetes management](https://docs.gitlab.com/ee/user/clusters/agent/)
-- [ ] [Set up protected environments](https://docs.gitlab.com/ee/ci/environments/protected_environments.html)
-
-***
-
-# Editing this README
-
-When you're ready to make this README your own, just edit this file and use the handy template below (or feel free to structure it however you want - this is just a starting point!). Thank you to [makeareadme.com](https://www.makeareadme.com/) for this template.
-
-## Suggestions for a good README
-Every project is different, so consider which of these sections apply to yours. The sections used in the template are suggestions for most open source projects. Also keep in mind that while a README can be too long and detailed, too long is better than too short. If you think your README is too long, consider utilizing another form of documentation rather than cutting out information.
-
-## Name
-Choose a self-explaining name for your project.
-
-## Description
-Let people know what your project can do specifically. Provide context and add a link to any reference visitors might be unfamiliar with. A list of Features or a Background subsection can also be added here. If there are alternatives to your project, this is a good place to list differentiating factors.
-
-## Badges
-On some READMEs, you may see small images that convey metadata, such as whether or not all the tests are passing for the project. You can use Shields to add some to your README. Many services also have instructions for adding a badge.
-
-## Visuals
-Depending on what you are making, it can be a good idea to include screenshots or even a video (you'll frequently see GIFs rather than actual videos). Tools like ttygif can help, but check out Asciinema for a more sophisticated method.
-
-## Installation
-Within a particular ecosystem, there may be a common way of installing things, such as using Yarn, NuGet, or Homebrew. However, consider the possibility that whoever is reading your README is a novice and would like more guidance. Listing specific steps helps remove ambiguity and gets people to using your project as quickly as possible. If it only runs in a specific context like a particular programming language version or operating system or has dependencies that have to be installed manually, also add a Requirements subsection.
-
-## Usage
-Use examples liberally, and show the expected output if you can. It's helpful to have inline the smallest example of usage that you can demonstrate, while providing links to more sophisticated examples if they are too long to reasonably include in the README.
-
-## Support
-Tell people where they can go to for help. It can be any combination of an issue tracker, a chat room, an email address, etc.
+## Datasets
 
-## Roadmap
-If you have ideas for releases in the future, it is a good idea to list them in the README.
+Most of the datasets are available [here](http://www.zhuolin.umiacs.io/projectlcksvd.html).
 
-## Contributing
-State if you are open to contributions and what your requirements are for accepting them.
+The action bank features can be downloaded from [here](https://cse.buffalo.edu/~jcorso/r/actionbank/).
 
-For people who want to make changes to your project, it's helpful to have some documentation on how to get started. Perhaps there is a script that they should run or some environment variables that they need to set. Make these steps explicit. These instructions could also be useful to your future self.
+We provide some extractors to organize the *HMDB51* and *UCF50* datasets as mat files efficiently.
 
-You can also document commands to lint the code or run tests. These steps help to ensure high code quality and reduce the likelihood that the changes inadvertently break something. Having instructions for running tests is especially helpful if it requires external setup, such as starting a Selenium server for testing in a browser.
+## Reproducing the results
 
-## Authors and acknowledgment
-Show your appreciation to those who have contributed to the project.
+The proposed methods are available under the following scripts: *aksvd_idb_test.m* and *aksvd_itdb_test.m*.
 
-## License
-For open source projects, say how it is licensed.
+## Funding
 
-## Project status
-If you have run out of energy or time for your project, put a note at the top of the README saying that development has slowed down or stopped completely. Someone may choose to fork your project or volunteer to step in as a maintainer or owner, allowing your project to keep going. You can also make an explicit request for maintainers.
+This work is supported by a grant of the Ministry of Research, Innovation and Digitization, CNCS â UEFISCDI, project number PN-III-P4-PCE-2021-0154, within PNCDI III.
diff --git a/aksvd_idb.m b/aksvd_idb.m
new file mode 100644
index 0000000000000000000000000000000000000000..8ec7c2b298b7c0b6a3dd61abc7fa334f56768ef5
--- /dev/null
+++ b/aksvd_idb.m
@@ -0,0 +1,108 @@
+function [accuracy, trainTime, testTime, D_all] = aksvd_idb(X_train, y_train,...
+    X_test, y_test, n_components, n_nonzero_coefs, n_iterations, M, gamma,...
+    lambda, seed)
+
+    % Prepare seed
+    rng(seed)
+
+    % Dataset properties
+    n_classes = length(unique(y_train));
+    n_features = size(X_train, 1);
+    desired_coh = 1 - M/2;
+
+    X_all = cell(1, n_classes);
+    D_all = cell(1, n_classes);
+    for i_class = 1:n_classes
+        D_all{i_class} = normcol_equal(randn(size(X_train,1), n_components));
+    end
+
+    % Start waitbar
+    trainTime = 0;
+    wb = waitbar(0, '[IDB] Training...');
+
+    for i_iter = 1:n_iterations
+        tmpTime = tic;
+        for i_class = 1:n_classes
+            % Coding method
+            X_all{i_class} = omp(X_train(:, y_train==i_class), D_all{i_class}, n_nonzero_coefs);
+
+            % Learning method
+            Y = X_train(:, y_train==i_class);
+            D = D_all{i_class};
+            X = X_all{i_class};
+            E = Y - D * X;
+
+            Dc = [];
+            for tmp_i_class = 1:n_classes
+               if tmp_i_class ~= i_class
+                  Dc = [Dc D_all{tmp_i_class}];
+               end
+            end
+
+            for j = 1:n_components
+                [~, data_indices, x] = find(X(j,:));
+
+                if (isempty(data_indices))
+                    d = randn(n_features, 1);
+                    D(:, j) = d / norm(d);
+                else
+                    d = D(:, j);
+                    [g1, g2] = compute_barrier_grad(d, Dc, desired_coh);
+
+                    F = E(:, data_indices) + d * x;
+                    d = F*x' - gamma*(g1 + lambda*g2);
+
+                    D(:, j) = d / norm(d);
+                    X(j, data_indices) = F'*D(:, j);
+                    E(:, data_indices) = F - D(:, j)*X(j, data_indices);
+                end
+            end
+
+            D_all{i_class} = D;
+            X_all{i_class} = X;
+        end
+        trainTime = trainTime + toc(tmpTime);
+
+        % Update waitbar
+        waitbar(i_iter/n_iterations, wb, sprintf('[IDB] Training - Remaining time: %d [sec]',...
+                round(trainTime/i_iter*(n_iterations - i_iter))));
+    end
+
+    % Close waitbar
+    close(wb);
+
+
+    % Start waitbar
+    testTime = 0;
+    wb = waitbar(0, '[IDB] Testing...');
+
+    Errs = [];
+    prediction = [];
+    for i_test = 1:size(X_test, 2)
+
+        tmpTime = tic;
+        errs = [];
+        for i_class = 1:n_classes
+            x = omp(X_test(:, i_test), D_all{i_class}, n_nonzero_coefs);
+            errs = [errs norm(X_test(:, i_test) - D_all{i_class} * x)];
+        end
+
+        Errs = [Errs; errs];
+        [~, index] = min(errs);
+        prediction = [prediction index];
+        testTime = testTime + toc(tmpTime);
+
+       % Update waitbar
+       waitbar(i_test/size(X_test, 2), wb, sprintf('[IDB] Testing - Remaining time: %d [sec]',...
+               round(testTime/i_test*(size(X_test, 2) - i_test))));
+    end
+
+    % Close waitbar
+    close(wb);
+
+
+    % Compute problem accuracy
+    accuracy = sum(y_test==prediction)/size(X_test,2);
+end
+
+
diff --git a/aksvd_idb_test.m b/aksvd_idb_test.m
new file mode 100644
index 0000000000000000000000000000000000000000..8edeca6d127d6407cb0cfd8da7a004454153c148
--- /dev/null
+++ b/aksvd_idb_test.m
@@ -0,0 +1,47 @@
+clc;
+clear;
+close all;
+
+n_components = 40;
+n_nonzero_coefs = 20;
+n_iterations = 10;
+
+M = 1.2;                % barrier margin
+gamma = 0.05; %0.5;            % trade-off factors
+lambda = 1E+2;%25;
+desired_coh = 1 - M/2;
+
+rng_seed = 1;
+yaleb_number = 32;
+ar_face_number = 20;
+caltech_101_number = 30;
+scene_15_number = 30;
+cmupie_number = 30;
+hmdb51_number = 30;
+
+DataPath = 'randomfaces4extendedyaleb'; number = yaleb_number;
+% DataPath = 'randomfaces4ar'; number = ar_face_number;
+% DataPath = 'spatialpyramidfeatures4caltech101'; number = caltech_101_number;
+% DataPath = 'spatialpyramidfeatures4scene15'; number = scene_15_number;
+% DataPath = 'CMUPIE_random_256'; number = cmupie_number;
+% DataPath = 'hmdb51_dataset'; number = hmdb51_number;
+load(fullfile('dbs', DataPath));
+
+[TrData, TtData, TrLabel, TtLabel] = extract_data(featureMat, labelMat,...
+                                                  number, rng_seed);
+
+% Data preprocessing
+y_train = TrLabel;
+y_test = TtLabel;
+X_train = TrData;
+X_test = TtData;
+
+[accuracy, trainTime, testTime, D_all] = aksvd_idb(X_train, y_train,...
+  X_test, y_test, n_components, n_nonzero_coefs, n_iterations, M, gamma, lambda,...
+  'default');
+
+fprintf("[IDB] Accuracy: %f\n", accuracy);
+fprintf('[IDB] Training time: %f [sec]\n', trainTime);
+fprintf('[IDB] Testing time: %f [sec]\n', testTime);
+fprintf('\n');
+
diff --git a/aksvd_idb_test_n_rounds.m b/aksvd_idb_test_n_rounds.m
new file mode 100644
index 0000000000000000000000000000000000000000..bc25679bfd4f87f730df3524a87b4b1a72c3b8ab
--- /dev/null
+++ b/aksvd_idb_test_n_rounds.m
@@ -0,0 +1,122 @@
+clc;
+clear;
+close all;
+
+n_rounds = 10;
+n_components = 40;
+n_nonzero_coefs = 20;
+n_iterations = 10;
+file = fopen('logs_idb.txt', 'a');
+
+yaleb_number = 32;
+ar_face_number = 20;
+caltech_101_number = 30;
+scene_15_number = 30;
+cmupie_number = 30;
+ucf50_number = 30;
+hmdb51_number = 30;
+
+DataPaths = [
+%     "randomfaces4extendedyaleb",...
+%     "randomfaces4ar",...
+%     "spatialpyramidfeatures4caltech101",...
+%     "spatialpyramidfeatures4scene15",...
+%     "CMUPIE_random_256",...
+%     "ucf50_dataset"...
+    "hmdb51_dataset"
+];
+
+samples_numbers = [
+%     yaleb_number,...
+%     ar_face_number,...
+%     caltech_101_number,...
+%     scene_15_number,...
+%     cmupie_number,...
+%     ucf50_number,...
+    hmdb51_number
+];
+
+for i_DataPath = 1:length(DataPaths)
+    DataPath = DataPaths(i_DataPath);
+    number = samples_numbers(i_DataPath);
+    load(fullfile('./dbs', DataPath));
+    
+    % Choose parameters by db
+    switch DataPath
+        case 'randomfaces4extendedyaleb'
+            M = 1.4;
+            lambda = 500;
+            gamma = 0.05;
+        case 'randomfaces4ar'
+            M = 1.2;
+            lambda = 0.5;
+            gamma = 5000;
+        case 'spatialpyramidfeatures4caltech101'   
+            M = 1.2;
+            lambda = 0.5;
+            gamma = 0.005;
+        case 'ucf50_dataset'
+            M = 1.2;
+            lambda = 500;
+            gamma = 500;
+        case 'CMUPIE_random_256'
+            M = 1.4;
+            lambda = 500;
+            gamma = 0.5;
+        case 'spatialpyramidfeatures4scene15'
+            M = 1.4;
+            lambda = 0.005;
+            gamma = 0.005;
+        case 'hmdb51_dataset'
+            M = 1.2;
+            lambda = 5000;
+            gamma = 500;
+    end
+    
+    T_Accuracy = 0;
+    T_TrTime = 0;
+    T_TtTime = 0;
+    
+    for i_round = 1:n_rounds
+        [TrData, TtData, TrLabel, TtLabel] = extract_data(featureMat, labelMat,number, i_round);
+
+        % Data preprocessing
+        y_train = TrLabel;
+        y_test = TtLabel;
+        X_train = TrData;
+        X_test = TtData;
+
+        [accuracy, trainTime, testTime, D_all] = aksvd_idb(X_train, y_train,...
+          X_test, y_test, n_components, n_nonzero_coefs, n_iterations, M, gamma, lambda,...
+          i_round);
+
+        fprintf('[IDB] Accuracy: %f\n', accuracy);
+        fprintf('[IDB] Training time: %f [sec]\n', trainTime);
+        fprintf('[IDB] Testing time: %f [sec]\n', testTime);
+        fprintf('\n');
+
+        fprintf(file, '[IDB] Accuracy: %f\n', accuracy);
+        fprintf(file, '[IDB] Training time: %f [sec]\n', trainTime);
+        fprintf(file, '[IDB] Testing time: %f [sec]\n', testTime);
+        fprintf(file, '\n');
+        
+        T_Accuracy = T_Accuracy + accuracy;
+        T_TrTime = T_TrTime + trainTime;
+        T_TtTime = T_TtTime + testTime;
+    end
+    
+    T_Accuracy = T_Accuracy / n_rounds;
+    T_TrTime = T_TrTime / n_rounds;
+    T_TtTime = T_TtTime / n_rounds;
+    
+    fprintf(file, '\n%s\n', DataPath);
+    fprintf(file, 'The running time for IDB training is : %.03f\n', T_TrTime);
+    fprintf(file, 'The running time for IDB testing is : %.03f\n', T_TtTime);
+    fprintf(file, 'Recognition rate for IDB is : %.04f\n', T_Accuracy);
+    fprintf(file, '\n\n');
+    fprintf(file, '======================================================');
+    fprintf(file, '\n\n');
+
+end
+
+fclose(file);
\ No newline at end of file
diff --git a/aksvd_idl.m b/aksvd_idl.m
new file mode 100644
index 0000000000000000000000000000000000000000..dc621c3c355ba3c5092fff2b59b0d8928d59bea1
--- /dev/null
+++ b/aksvd_idl.m
@@ -0,0 +1,101 @@
+function [accuracy, trainTime, testTime, D_all] = aksvd_idl(X_train, y_train,...
+  X_test, y_test, gamma, n_components, n_nonzero_coefs, n_iterations, seed)
+
+    % Prepare seed
+    rng(seed)
+
+    % Dataset properties
+    n_classes = length(unique(y_train));
+
+    X_all = cell(1, n_classes);
+    D_all = cell(1, n_classes);
+    for i_class = 1:n_classes
+        D_all{i_class} = normcol_equal(randn(size(X_train,1), n_components));
+    end
+
+    % Start waitbar
+    trainTime = 0;
+    wb = waitbar(0, '[IDL] Training...');
+
+    for i_iter = 1:n_iterations
+        tmpTime = tic;
+        for i_class = 1:n_classes
+            % Coding method
+            X_all{i_class} = omp(X_train(:, y_train==i_class), D_all{i_class}, n_nonzero_coefs);
+
+            % Learning method
+            Y = X_train(:, y_train==i_class);
+            D = D_all{i_class};
+            X = X_all{i_class};
+            E = Y - D * X;
+
+            Dc = [];
+            for tmp_i_class = 1:n_classes
+               if tmp_i_class ~= i_class
+                  Dc = [Dc D_all{tmp_i_class}];
+               end
+            end
+
+            for i_atom = 1:size(D, 2)
+                [~, atom_usages, ~] = find(X(i_atom,:));
+
+                if (isempty(atom_usages))
+                    D(:, i_atom) = randn(size(D,1), 1);
+                    D(:, i_atom) = D(:, i_atom) / norm(D(:, i_atom));
+                else
+                    F = E(:, atom_usages) + D(:, i_atom) * X(i_atom, atom_usages);
+                    d = F * X(i_atom, atom_usages)' - 2 * gamma * Dc * (Dc' * D(:, i_atom));
+                    D(:, i_atom) = d / norm(d);
+                    X(i_atom, atom_usages) = F' * D(:, i_atom);
+                    E(:, atom_usages) = F - D(:, i_atom) * X(i_atom, atom_usages);
+                end
+            end
+
+            D_all{i_class} = D;
+            X_all{i_class} = X;
+        end
+        trainTime = trainTime + toc(tmpTime);
+
+        % Update waitbar
+        waitbar(i_iter/n_iterations, wb, sprintf('[IDL] Training - Remaining time: %d [sec]',...
+                round(trainTime/i_iter*(n_iterations - i_iter))));
+    end
+
+    % Close waitbar
+    close(wb);
+
+
+    % Start waitbar
+    testTime = 0;
+    wb = waitbar(0, '[IDL] Testing...');
+
+    Errs = [];
+    prediction = [];
+    for i_test = 1:size(X_test, 2)
+
+        tmpTime = tic;
+        errs = [];
+        for i_class = 1:n_classes
+            x = omp(X_test(:, i_test), D_all{i_class}, n_nonzero_coefs);
+            errs = [errs norm(X_test(:, i_test) - D_all{i_class} * x)];
+        end
+
+        Errs = [Errs; errs];
+        [~, index] = min(errs);
+        prediction = [prediction index];
+        testTime = testTime + toc(tmpTime);
+
+       % Update waitbar
+       waitbar(i_test/size(X_test, 2), wb, sprintf('[IDL] Testing - Remaining time: %d [sec]',...
+               round(testTime/i_test*(size(X_test, 2) - i_test))));
+    end
+
+    % Close waitbar
+    close(wb);
+
+
+    % Compute problem accuracy
+    accuracy = sum(y_test==prediction)/size(X_test,2);
+end
+
+
diff --git a/aksvd_idl_test.m b/aksvd_idl_test.m
new file mode 100644
index 0000000000000000000000000000000000000000..1e0b81cdf4d7249727b74d1305c3fd07f027d0cc
--- /dev/null
+++ b/aksvd_idl_test.m
@@ -0,0 +1,44 @@
+clc;
+clear;
+close all;
+
+gamma = 4;
+n_components = 40;
+n_nonzero_coefs = 20;
+n_iterations = 10;
+
+rng_seed = 1;
+yaleb_number = 32;
+ar_face_number = 20;
+caltech_101_number = 30;
+scene_15_number = 30;
+cmupie_number = 30;
+ucf50_number = 30;
+hmdb51_number = 30;
+
+DataPath = 'randomfaces4extendedyaleb'; number = yaleb_number;
+% DataPath = 'randomfaces4ar'; number = ar_face_number;
+% DataPath = 'spatialpyramidfeatures4caltech101'; number = caltech_101_number;
+% DataPath = 'spatialpyramidfeatures4scene15'; number = scene_15_number;
+% DataPath = 'CMUPIE_random_256'; number = cmupie_number;
+% DataPath = 'ucf50_dataset'; number = ucf50_number;
+% DataPath = 'hmdb51_dataset'; number = hmdb51_number;
+load(fullfile('dbs', DataPath));
+
+[TrData, TtData, TrLabel, TtLabel] = extract_data(featureMat, labelMat,...
+                                                  number, rng_seed);
+
+% Data preprocessing
+y_train = TrLabel;
+y_test = TtLabel;
+X_train = TrData;
+X_test = TtData;
+
+[accuracy, trainTime, testTime, D_all] = aksvd_idl(X_train, y_train,...
+  X_test, y_test, gamma, n_components, n_nonzero_coefs, n_iterations, 'default');
+
+fprintf("[IDL] Accuracy: %f\n", accuracy);
+fprintf('[IDL] Training time: %f [sec]\n', trainTime);
+fprintf('[IDL] Testing time: %f [sec]\n', testTime);
+fprintf('\n');
+
diff --git a/aksvd_idl_test_n_rounds.m b/aksvd_idl_test_n_rounds.m
new file mode 100644
index 0000000000000000000000000000000000000000..61ce8071d6d2972ba5ca17c869ee913a2949de7f
--- /dev/null
+++ b/aksvd_idl_test_n_rounds.m
@@ -0,0 +1,106 @@
+clc;
+clear;
+close all;
+
+n_rounds = 10;
+n_components = 40;
+n_nonzero_coefs = 20;
+n_iterations = 10;
+file = fopen('logs_idl.txt', 'a');
+
+yaleb_number = 32;
+ar_face_number = 20;
+caltech_101_number = 30;
+scene_15_number = 30;
+cmupie_number = 30;
+ucf50_number = 30;
+hmdb51_number = 30;
+
+DataPaths = [
+%     "randomfaces4extendedyaleb",...
+%     "randomfaces4ar",...
+%     "spatialpyramidfeatures4caltech101",...
+%     "spatialpyramidfeatures4scene15",...
+%     "CMUPIE_random_256",...
+%     "ucf50_dataset",...
+    "hmdb51_dataset"
+];
+
+samples_numbers = [
+%     yaleb_number,...
+%     ar_face_number,...
+%     caltech_101_number,...
+%     scene_15_number,...
+%     cmupie_number,...
+%     ucf50_number,...
+    hmdb51_number
+];
+
+for i_DataPath = 1:length(DataPaths)
+    DataPath = DataPaths(i_DataPath);
+    number = samples_numbers(i_DataPath);
+    load(fullfile('./dbs', DataPath));
+    
+    % Choose parameters by db
+    switch DataPath
+        case 'randomfaces4extendedyaleb'
+            gamma = 0.005;
+        case 'randomfaces4ar'
+            gamma = 5000;
+        case 'spatialpyramidfeatures4caltech101'   
+            gamma = 0.005;
+        case 'ucf50_dataset'
+            gamma = 5;
+        case 'CMUPIE_random_256'
+            gamma = 5;
+        case 'spatialpyramidfeatures4scene15'
+            gamma = 0.005;
+        case 'hmdb51_dataset'
+            gamma = 50;
+    end
+    
+    T_Accuracy = 0;
+    T_TrTime = 0;
+    T_TtTime = 0;
+    
+    for i_round = 1:n_rounds
+        [TrData, TtData, TrLabel, TtLabel] = extract_data(featureMat, labelMat,number, i_round);
+
+        % Data preprocessing
+        y_train = TrLabel;
+        y_test = TtLabel;
+        X_train = TrData;
+        X_test = TtData;
+
+        [accuracy, trainTime, testTime, D_all] = aksvd_idl(X_train, y_train,...
+          X_test, y_test, gamma, n_components, n_nonzero_coefs, n_iterations, i_round);
+
+        fprintf('[IDL] Accuracy: %f\n', accuracy);
+        fprintf('[IDL] Training time: %f [sec]\n', trainTime);
+        fprintf('[IDL] Testing time: %f [sec]\n', testTime);
+        fprintf('\n');
+        
+        fprintf(file, '[IDL] Accuracy: %f\n', accuracy);
+        fprintf(file, '[IDL] Training time: %f [sec]\n', trainTime);
+        fprintf(file, '[IDL] Testing time: %f [sec]\n', testTime);
+        fprintf(file, '\n');
+        
+        T_Accuracy = T_Accuracy + accuracy;
+        T_TrTime = T_TrTime + trainTime;
+        T_TtTime = T_TtTime + testTime;
+    end
+    
+    T_Accuracy = T_Accuracy / n_rounds;
+    T_TrTime = T_TrTime / n_rounds;
+    T_TtTime = T_TtTime / n_rounds;
+    
+    fprintf(file, '\n%s\n', DataPath);
+    fprintf(file, 'The running time for IDL training is : %.03f\n', T_TrTime);
+    fprintf(file, 'The running time for IDL testing is : %.03f\n', T_TtTime);
+    fprintf(file, 'Recognition rate for IDL is : %.04f\n', T_Accuracy);
+    fprintf(file, '\n\n');
+    fprintf(file, '======================================================');
+    fprintf(file, '\n\n');
+end
+
+fclose(file);
\ No newline at end of file
diff --git a/aksvd_itdb.m b/aksvd_itdb.m
new file mode 100644
index 0000000000000000000000000000000000000000..c940ad49f4d9cafaf571b7a7c7cd32f758642980
--- /dev/null
+++ b/aksvd_itdb.m
@@ -0,0 +1,133 @@
+function [accuracy, trainTime, testTime, D_all] = aksvd_itdb(X_train, y_train,...
+    X_test, y_test, n_components, n_nonzero_coefs, n_iterations, M, gamma,...
+    triplet_prec, seed)
+
+    % Prepare seed
+    rng(seed)
+
+    % Dataset properties
+    n_classes = length(unique(y_train));
+    n_features = size(X_train, 1);
+
+    X_all = cell(1, n_classes);
+    D_all = cell(1, n_classes);
+    for i_class = 1:n_classes
+        D_all{i_class} = normcol_equal(randn(size(X_train,1), n_components));
+    end
+
+
+    % Start waitbar
+    trainTime = 0;
+    wb = waitbar(0, '[ITDB] Training...');
+
+    for i_iter = 1:n_iterations
+        tmpTime = tic;
+        for i_class = 1:n_classes
+            % Coding method
+            X_all{i_class} = omp(X_train(:, y_train==i_class), D_all{i_class}, n_nonzero_coefs);
+
+            % Learning method
+            Y = X_train(:, y_train==i_class);
+            D = D_all{i_class};
+            X = X_all{i_class};
+            E = Y - D * X;
+
+            Dc = [];
+            for tmp_i_class = 1:n_classes
+               if tmp_i_class ~= i_class
+                  Dc = [Dc D_all{tmp_i_class}];
+               end
+            end
+
+            for j = 1:n_components
+                [~, data_indices, x] = find(X(j,:));
+
+                if (isempty(data_indices))
+                    d = randn(n_features, 1);
+                    D(:, j) = d / norm(d);
+                else
+                    d = D(:, j);
+                    rp = randperm(size(Dc, 2));
+                    rp = rp(1:length(rp) * triplet_prec);
+                    grad = compute_triplet_barrier_grad(d, D, Dc(:, rp), M);
+
+                    F = E(:, data_indices) + d * x;
+                    d = F*x' - gamma*grad;
+
+                    D(:, j) = d / norm(d);
+                    X(j, data_indices) = F'*D(:, j);
+                    E(:, data_indices) = F - D(:, j)*X(j, data_indices);
+                end
+            end
+
+            D_all{i_class} = D;
+            X_all{i_class} = X;
+        end
+        trainTime = trainTime + toc(tmpTime);
+
+        % Update waitbar
+        waitbar(i_iter/n_iterations, wb, sprintf('[ITDB] Training - Remaining time: %d [sec]',...
+                round(trainTime/i_iter*(n_iterations - i_iter))));
+
+
+
+
+%         if mod(i_iter, 5) == 0
+%             Errs = [];
+%             prediction = [];
+%             for i_test = 1:size(X_test, 2)
+%                 errs = [];
+%                 for i_class = 1:n_classes
+%                     x = omp(X_test(:, i_test), D_all{i_class}, n_nonzero_coefs);
+%                     errs = [errs norm(X_test(:, i_test) - D_all{i_class} * x)];
+%                 end
+% 
+%                 Errs = [Errs; errs];
+%                 [~, index] = min(errs);
+%                 prediction = [prediction index];
+%             end
+% 
+%             % Compute problem accuracy
+%             accuracy = sum(y_test==prediction)/size(X_test,2);
+%             fprintf("[ITDB] Accuracy round %d: %f\n", i_iter, accuracy);
+%             fprintf('\n');
+%         end
+    end
+
+    % Close waitbar
+    close(wb);
+
+
+    % Start waitbar
+    testTime = 0;
+    wb = waitbar(0, '[ITDB] Testing...');
+
+    Errs = [];
+    prediction = [];
+    for i_test = 1:size(X_test, 2)
+
+        tmpTime = tic;
+        errs = [];
+        for i_class = 1:n_classes
+            x = omp(X_test(:, i_test), D_all{i_class}, n_nonzero_coefs);
+            errs = [errs norm(X_test(:, i_test) - D_all{i_class} * x)];
+        end
+
+        Errs = [Errs; errs];
+        [~, index] = min(errs);
+        prediction = [prediction index];
+        testTime = testTime + toc(tmpTime);
+
+       % Update waitbar
+       waitbar(i_test/size(X_test, 2), wb, sprintf('[ITDB] Testing - Remaining time: %d [sec]',...
+               round(testTime/i_test*(size(X_test, 2) - i_test))));
+    end
+
+    % Close waitbar
+    close(wb);
+
+
+    % Compute problem accuracy
+    accuracy = sum(y_test==prediction)/size(X_test,2);
+
+end
diff --git a/aksvd_itdb_test.m b/aksvd_itdb_test.m
new file mode 100644
index 0000000000000000000000000000000000000000..4146c7dd67461f4502bf137611d7859af55d9aea
--- /dev/null
+++ b/aksvd_itdb_test.m
@@ -0,0 +1,48 @@
+clc;
+clear;
+close all;
+
+n_components = 40;
+n_nonzero_coefs = 20;
+n_iterations = 10;
+triplet_prec = 0.05;
+
+M = 1.2;                % barrier margin
+gamma = 0.5;            % trade-off factors
+
+rng_seed = 1;
+yaleb_number = 32;
+ar_face_number = 20;
+caltech_101_number = 30;
+scene_15_number = 30;
+cmupie_number = 30;
+ucf50_number = 30;
+hmdb51_number = 30;
+
+DataPath = 'randomfaces4extendedyaleb'; number = yaleb_number;
+% DataPath = 'randomfaces4ar'; number = ar_face_number;
+% DataPath = 'spatialpyramidfeatures4caltech101'; number = caltech_101_number;
+% DataPath = 'spatialpyramidfeatures4scene15'; number = scene_15_number;
+% DataPath = 'CMUPIE_random_256'; number = cmupie_number;
+% DataPath = 'ucf50_dataset'; number = ucf50_number;
+% DataPath = 'hmdb51_dataset'; number = hmdb51_number;
+load(fullfile('dbs', DataPath));
+
+[TrData, TtData, TrLabel, TtLabel] = extract_data(featureMat, labelMat,...
+                                                  number, rng_seed);
+
+% Data preprocessing
+y_train = TrLabel;
+y_test = TtLabel;
+X_train = TrData;
+X_test = TtData;
+
+[accuracy, trainTime, testTime, D_all] = aksvd_itdb(X_train, y_train,...
+    X_test, y_test, n_components, n_nonzero_coefs, n_iterations, M, gamma,...
+    triplet_prec, 'default');
+
+fprintf("[ITDB] Accuracy: %f\n", accuracy);
+fprintf('[ITDB] Training time: %f [sec]\n', trainTime);
+fprintf('[ITDB] Testing time: %f [sec]\n', testTime);
+fprintf('\n');
+
diff --git a/aksvd_itdb_test_n_rounds.m b/aksvd_itdb_test_n_rounds.m
new file mode 100644
index 0000000000000000000000000000000000000000..3f1399501f6a68c2db1afd001612a8c49da088a8
--- /dev/null
+++ b/aksvd_itdb_test_n_rounds.m
@@ -0,0 +1,121 @@
+clc;
+clear;
+close all;
+
+n_rounds = 10;
+n_components = 40;
+n_nonzero_coefs = 20;
+n_iterations = 10;
+file = fopen('logs_itdb.txt', 'a');
+
+yaleb_number = 32;
+ar_face_number = 20;
+caltech_101_number = 30;
+scene_15_number = 30;
+cmupie_number = 30;
+ucf50_number = 30;
+hmdb51_number = 30;
+
+DataPaths = [
+%     "randomfaces4extendedyaleb",...
+%     "randomfaces4ar",...
+%     "spatialpyramidfeatures4caltech101",...
+%     "spatialpyramidfeatures4scene15",...
+%     "CMUPIE_random_256",...
+%     "ucf50_dataset",...
+    "hmdb51_dataset"
+];
+
+samples_numbers = [
+%     yaleb_number,...
+%     ar_face_number,...
+%     caltech_101_number,...
+%     scene_15_number,...
+%     cmupie_number,...
+%     ucf50_number,...
+    hmdb51_number
+];
+
+for i_DataPath = 1:length(DataPaths)
+    DataPath = DataPaths(i_DataPath);
+    number = samples_numbers(i_DataPath);
+    load(fullfile('./dbs', DataPath));
+    
+    % Choose parameters by db
+    switch DataPath
+        case 'randomfaces4extendedyaleb'
+            M = 1.8;
+            triplet_prec = 0.1;
+            gamma = 500;
+        case 'randomfaces4ar'
+            M = 1.4;
+            triplet_prec = 0.1;
+            gamma = 500;
+        case 'spatialpyramidfeatures4caltech101'   
+            M = 1.4;
+            triplet_prec = 0.05;
+            gamma = 0.005;
+        case 'ucf50_dataset'
+            M = 1.6;
+            triplet_prec = 0.05;
+            gamma = 5000;
+        case 'CMUPIE_random_256'
+            M = 1.4;
+            triplet_prec = 0.05;
+            gamma = 500;
+        case 'spatialpyramidfeatures4scene15'
+            M = 1.4;
+            triplet_prec = 0.05;
+            gamma = 0.00005;
+        case 'hmdb51_dataset'
+            M = 1.6;
+            triplet_prec = 0.05;
+            gamma = 5000;
+    end
+    
+    T_Accuracy = 0;
+    T_TrTime = 0;
+    T_TtTime = 0;
+    
+    for i_round = 1:n_rounds
+        [TrData, TtData, TrLabel, TtLabel] = extract_data(featureMat, labelMat,number, i_round);
+
+        % Data preprocessing
+        y_train = TrLabel;
+        y_test = TtLabel;
+        X_train = TrData;
+        X_test = TtData;
+
+        [accuracy, trainTime, testTime, D_all] = aksvd_itdb(X_train, y_train,...
+            X_test, y_test, n_components, n_nonzero_coefs, n_iterations, M, gamma,...
+            triplet_prec, i_round);
+
+        fprintf('[ITDB] Accuracy: %f\n', accuracy);
+        fprintf('[ITDB] Training time: %f [sec]\n', trainTime);
+        fprintf('[ITDB] Testing time: %f [sec]\n', testTime);
+        fprintf('\n');
+
+        fprintf(file, '[ITDB] Accuracy: %f\n', accuracy);
+        fprintf(file, '[ITDB] Training time: %f [sec]\n', trainTime);
+        fprintf(file, '[ITDB] Testing time: %f [sec]\n', testTime);
+        fprintf(file, '\n');
+
+        T_Accuracy = T_Accuracy + accuracy;
+        T_TrTime = T_TrTime + trainTime;
+        T_TtTime = T_TtTime + testTime;
+    end
+
+    T_Accuracy = T_Accuracy / n_rounds;
+    T_TrTime = T_TrTime / n_rounds;
+    T_TtTime = T_TtTime / n_rounds;
+
+    fprintf(file, '\n%s\n', DataPath);
+    fprintf(file, 'The running time for ITDB training is : %.03f\n', T_TrTime);
+    fprintf(file, 'The running time for ITDB testing is : %.03f\n', T_TtTime);
+    fprintf(file, 'Recognition rate for ITDB is : %.04f\n', T_Accuracy);
+    fprintf(file, '\n\n');
+    fprintf(file, '======================================================');
+    fprintf(file, '\n\n');
+end
+
+fclose(file);
\ No newline at end of file
diff --git a/compute_barrier_grad.m b/compute_barrier_grad.m
new file mode 100644
index 0000000000000000000000000000000000000000..3462bae3e3def562c38a86a0d06d7dab71dc4c69
--- /dev/null
+++ b/compute_barrier_grad.m
@@ -0,0 +1,20 @@
+function [g1, g2] = compute_barrier_grad(d, Dc, desired_coh)
+    n_features = length(d);
+    v = Dc'*d;
+    w = max(abs(v)/desired_coh, 1);
+
+    % what atoms are too close?
+    i_minus = find(v>desired_coh);
+    i_plus = find(v<-desired_coh);
+
+    % gradients
+    g1 = Dc*(w.*v);
+    g2 = zeros(n_features, 1);
+    if ~isempty(i_minus)
+        g2 = g2 + sum(Dc(:,i_minus) - repmat(d,1,length(i_minus)), 2);
+    end
+    if ~isempty(i_plus)
+        g2 = g2 - sum(Dc(:,i_plus) + repmat(d,1,length(i_plus)), 2);
+    end
+end
+
diff --git a/compute_triplet_barrier_grad.m b/compute_triplet_barrier_grad.m
new file mode 100644
index 0000000000000000000000000000000000000000..ea85634459879e807b23fea4850fe4440b1c8bcb
--- /dev/null
+++ b/compute_triplet_barrier_grad.m
@@ -0,0 +1,22 @@
+function [grad] = compute_triplet_barrier_grad(d, D, Dc, M)
+    sign_D = sign(d' * D);
+    diff_D = d - sign_D .* D;
+    l2_diff_D = vecnorm(diff_D).^2;
+
+    sign_Dc = sign(d' * Dc);
+    diff_Dc = d - sign_Dc .* Dc;
+    l2_diff_Dc = vecnorm(diff_Dc).^2;
+
+    r_diff_D = repelem(diff_D, 1, length(l2_diff_Dc));
+    r_l2_diff_D = repelem(l2_diff_D, 1, length(l2_diff_Dc));
+    r_diff_Dc = repmat(diff_Dc, 1, length(l2_diff_D));
+    r_l2_diff_Dc = repmat(l2_diff_Dc, 1, length(l2_diff_D));
+
+    v = max(0, M - r_l2_diff_D + r_l2_diff_Dc);
+    index = find(v == 0);
+    r_diff_D(:, index) = 0;
+    r_diff_Dc(:, index) = 0;
+
+    grad = sum(r_diff_D - r_diff_Dc, 2);
+end
+
diff --git a/dbs/CMUPIE_random_256.mat b/dbs/CMUPIE_random_256.mat
new file mode 100644
index 0000000000000000000000000000000000000000..a3ef91f3de869c777d1e0f1945511ca3b07223f8
Binary files /dev/null and b/dbs/CMUPIE_random_256.mat differ
diff --git a/extract_data.m b/extract_data.m
new file mode 100644
index 0000000000000000000000000000000000000000..684f9ab61046995547043d7a0cfd6af9bf2aa2fb
--- /dev/null
+++ b/extract_data.m
@@ -0,0 +1,29 @@
+function [TrData, TtData, TrLabel, TtLabel] = extract_data(featureMat, labelMat, number, rng_seed)
+    rng(rng_seed, 'twister');
+    n_classes = size(labelMat, 1);
+    n_features = size(featureMat, 1);
+
+    TrData = zeros(n_features, number*n_classes);
+    TtData = zeros(n_features, size(featureMat, 2) - number*n_classes);
+    TrLabel = zeros(1, number*n_classes);
+    TtLabel = zeros(1, size(featureMat, 2) - number*n_classes);
+    
+    i_tr = 1;
+    i_tt = 1;
+
+    for i_class = 1:n_classes
+        TempData = featureMat(:, labelMat(i_class, :) == 1);
+        TempLabel = i_class * ones(1, size(TempData, 2));
+
+        l = length(TempLabel);
+        index = randperm(l);
+
+        TrData(:, i_tr:(i_tr+number-1)) = TempData(:, index(1:number));
+        TrLabel(:,i_tr:(i_tr+number-1)) = TempLabel(:, index(1:number));
+        TtData(:, i_tt:(i_tt+(l-number-1))) = TempData(:, index((number+1):end));
+        TtLabel(:, i_tt:(i_tt+(l-number-1))) = TempLabel(:, index((number+1):end));
+
+        i_tr = i_tr + number;
+        i_tt = i_tt + l - number;
+    end
+end
\ No newline at end of file
diff --git a/extractors/hmdb51_extractor.m b/extractors/hmdb51_extractor.m
new file mode 100644
index 0000000000000000000000000000000000000000..7397364164f299a7035616d71e4bc19628c8d356
--- /dev/null
+++ b/extractors/hmdb51_extractor.m
@@ -0,0 +1,39 @@
+% This script is used to extract features from the
+% original repository of action bank from Buffalo University.
+
+clc;
+clear;
+
+k = 5000; % number of pca features
+dirFolders = dir('../data/ab_hmdb51_e2f1g2_matlab');
+
+position = 1;
+label = 1;
+db_features = zeros(14965, 6766);
+
+for i = 1:length(dirFolders)
+   if ~strcmp(dirFolders(i).name, '.') && ~strcmp(dirFolders(i).name, '..') && dirFolders(i).isdir
+       files = dir(strcat(dirFolders(i).folder, '/', dirFolders(i).name, '/*.mat'));
+       
+       for j = 1:length(files)
+            tmp_data = load(strcat(files(j).folder, '/', files(j).name));
+            db_features(:, position) = tmp_data.v;
+            db_labels(position) = label;
+            position = position + 1;
+       end
+       
+       label = label + 1;
+   end
+end
+
+[coeff, ~, ~] = pca(db_features');
+featureMat = (db_features' * coeff(:, 1:k))';
+n_samples = size(db_features, 2);
+n_classes = length(unique(db_labels));
+labelMat = zeros(n_classes, n_samples);
+
+for i = 1:n_samples
+    labelMat(db_labels(i), i) = 1;
+end
+
+save('hmdb51_dataset.mat', 'featureMat', 'labelMat');
\ No newline at end of file
diff --git a/extractors/ucf50_extractor.m b/extractors/ucf50_extractor.m
new file mode 100755
index 0000000000000000000000000000000000000000..9ce676a4b7db611fd4f680ad7a8f5efb1da04178
--- /dev/null
+++ b/extractors/ucf50_extractor.m
@@ -0,0 +1,39 @@
+% This script is used to extract features from the
+% original repository of action bank from Buffalo University.
+
+clc;
+clear;
+
+k = 5000; % number of pca features
+dirFolders = dir('../data/ab_ucf50_matlab');
+
+position = 1;
+label = 1;
+db_features = zeros(14965, 6680);
+
+for i = 1:length(dirFolders)
+   if ~strcmp(dirFolders(i).name, '.') && ~strcmp(dirFolders(i).name, '..') && dirFolders(i).isdir
+       files = dir(strcat(dirFolders(i).folder, '/', dirFolders(i).name, '/*.mat'));
+       
+       for j = 1:length(files)
+            tmp_data = load(strcat(files(j).folder, '/', files(j).name));
+            db_features(:, position) = tmp_data.v;
+            db_labels(position) = label;
+            position = position + 1;
+       end
+       
+       label = label + 1;
+   end
+end
+
+[coeff, ~, ~] = pca(db_features');
+featureMat = (db_features' * coeff(:, 1:k))';
+n_samples = size(db_features, 2);
+n_classes = length(unique(db_labels));
+labelMat = zeros(n_classes, n_samples);
+
+for i = 1:n_samples
+    labelMat(db_labels(i), i) = 1;
+end
+
+save('ucf50_dataset.mat', 'featureMat', 'labelMat');
\ No newline at end of file
diff --git a/hyper_parameters_tunning.m b/hyper_parameters_tunning.m
new file mode 100644
index 0000000000000000000000000000000000000000..4cb4f7e7f801e92660fe329000494583460da3a6
--- /dev/null
+++ b/hyper_parameters_tunning.m
@@ -0,0 +1,170 @@
+clc;
+clear;
+close all;
+
+n_rounds = 1;
+n_components_list = [40];
+n_nonzero_coefs_prec_list = [0.5];
+n_iterations_list = [10];
+M_list = [1.2, 1.4, 1.6, 1.8];
+gamma_list = [5e-8 5e-7 5e-6];% 5 5e+1 5e+2 5e+3];
+lambda_list = [5e-5 5e-4 5e-3 5e-2 5e-1];% 5 5e+1 5e+2 5e+3];
+triplet_prec_list = [0.05];
+
+% this are used only for discrim dl
+% alpha_list = logspace(-2, 2, 10);
+% n_list = [1000];
+
+rng_seed = 1;
+yaleb_number = 32;
+ar_face_number = 20;
+caltech_101_number = 30;
+scene_15_number = 30;
+cmupie_number = 30;
+ucf50_number = 30;
+hmdb51_number = 30;
+
+% DataPath = 'randomfaces4extendedyaleb'; number = yaleb_number;
+% DataPath = 'randomfaces4ar'; number = ar_face_number;
+DataPath = 'spatialpyramidfeatures4caltech101'; number = caltech_101_number;
+% DataPath = 'spatialpyramidfeatures4scene15'; number = scene_15_number;
+% DataPath = 'CMUPIE_random_256'; number = cmupie_number;
+% DataPath = 'ucf50_dataset'; number = ucf50_number;
+% DataPath = 'hmdb51_dataset'; number = hmdb51_number;
+load(fullfile('dbs', DataPath));
+
+[TrData, TtData, TrLabel, TtLabel] = extract_data(featureMat, labelMat,...
+                                                  number, rng_seed);
+
+% Data preprocessing
+y_train = TrLabel;
+y_test = TtLabel;
+X_train = TrData;
+X_test = TtData;
+
+
+% AK-SVD IDL
+fileID = fopen(sprintf('./logs/aksvd_%s_idl.log', DataPath),'a');
+
+for seed = 1:n_rounds
+    for i_n_components = 1:length(n_components_list)
+        n_components = n_components_list(i_n_components);
+
+        for i_n_nonzero_coefs_prec = 1:length(n_nonzero_coefs_prec_list)
+            n_nonzero_coefs_prec = n_nonzero_coefs_prec_list(i_n_nonzero_coefs_prec);
+            n_nonzero_coefs = round(n_nonzero_coefs_prec * n_components);
+
+            for i_n_iterations = 1:length(n_iterations_list)
+                n_iterations = n_iterations_list(i_n_iterations);
+
+                for i_gamma = 1:length(gamma_list)
+                    gamma = gamma_list(i_gamma);
+
+                    [accuracy, trainTime, testTime, D_all] = aksvd_idl(X_train, y_train,...
+                        X_test, y_test, gamma, n_components, n_nonzero_coefs, n_iterations, seed);
+
+                    mat_file = sprintf('idl_n_components_%d_n_nonzero_coefs_%d_n_iterations_%d_gamma_%d.mat',...
+                                       n_components, n_nonzero_coefs, n_iterations, gamma);
+%                     save(['./mats/' mat_file], 'DataPath', 'accuracy', 'trainTime', 'testTime', 'D_all',...
+%                          'n_components', 'n_nonzero_coefs', 'n_iterations', 'gamma');
+
+                    fprintf(fileID, '%s: %f\n', mat_file, accuracy);
+                end
+            end
+        end 
+    end
+end
+    
+fclose(fileID);
+
+
+
+
+% AK-SVD IDB
+fileID = fopen(sprintf('./logs/aksvd_%s_idb.log', DataPath),'a');
+
+for seed = 1:n_rounds
+    for i_n_components = 1:length(n_components_list)
+        n_components = n_components_list(i_n_components);
+
+        for i_n_nonzero_coefs_prec = 1:length(n_nonzero_coefs_prec_list)
+            n_nonzero_coefs_prec = n_nonzero_coefs_prec_list(i_n_nonzero_coefs_prec);
+            n_nonzero_coefs = round(n_nonzero_coefs_prec * n_components);
+
+            for i_n_iterations = 1:length(n_iterations_list)
+                n_iterations = n_iterations_list(i_n_iterations);
+
+                for i_M = 1:length(M_list)
+                    M = M_list(i_M);
+                
+                    for i_gamma = 1:length(gamma_list)
+                        gamma = gamma_list(i_gamma);
+
+                        for i_lambda = 1:length(lambda_list)
+                            lambda = lambda_list(i_lambda);
+
+                            [accuracy, trainTime, testTime, D_all] = aksvd_idb(X_train, y_train,...
+                                X_test, y_test, n_components, n_nonzero_coefs, n_iterations, M, gamma,...
+                                lambda, seed);
+
+                            mat_file = sprintf('idb_n_components_%d_n_nonzero_coefs_%d_n_iterations_%d_gamma_%f_lambda_%f_M_%d.mat',...
+                                               n_components, n_nonzero_coefs, n_iterations, gamma, lambda, M);
+%                             save(['./mats/' mat_file], 'DataPath', 'accuracy', 'trainTime', 'testTime', 'D_all',...
+%                                  'n_components', 'n_nonzero_coefs', 'n_iterations', 'gamma', 'lambda', 'M');
+
+                            fprintf(fileID, '%s: %f\n', mat_file, accuracy);
+                        end
+                    end
+                end
+            end
+        end 
+    end
+end
+
+fclose(fileID);
+
+
+
+
+% AK-SVD ITDB
+fileID = fopen(sprintf('./logs/aksvd_%s_itdb.log', DataPath),'a');
+
+for seed = 1:n_rounds
+    for i_n_components = 1:length(n_components_list)
+        n_components = n_components_list(i_n_components);
+
+        for i_n_nonzero_coefs_prec = 1:length(n_nonzero_coefs_prec_list)
+            n_nonzero_coefs_prec = n_nonzero_coefs_prec_list(i_n_nonzero_coefs_prec);
+            n_nonzero_coefs = round(n_nonzero_coefs_prec * n_components);
+
+            for i_n_iterations = 1:length(n_iterations_list)
+                n_iterations = n_iterations_list(i_n_iterations);
+
+                for i_M = 1:length(M_list)
+                    M = M_list(i_M);
+                
+                    for i_gamma = 1:length(gamma_list)
+                        gamma = gamma_list(i_gamma);
+
+                        for i_triple_prec = 1:length(triplet_prec_list)
+                            triplet_prec = triplet_prec_list(i_triple_prec);
+
+                            [accuracy, trainTime, testTime, D_all] = aksvd_itdb(X_train, y_train,...
+                                X_test, y_test, n_components, n_nonzero_coefs, n_iterations, M, gamma,...
+                                triplet_prec, seed);
+
+                            mat_file = sprintf('itdb_n_components_%d_n_nonzero_coefs_%d_n_iterations_%d_gamma_%f_triplet_prec_%f_M_%d.mat',...
+                                               n_components, n_nonzero_coefs, n_iterations, gamma, triplet_prec, M);
+%                             save(['./mats/' mat_file], 'DataPath', 'accuracy', 'trainTime', 'testTime', 'D_all',...
+%                                  'n_components', 'n_nonzero_coefs', 'n_iterations', 'gamma', 'triplet_prec', 'M');
+
+                            fprintf(fileID, '%s: %f\n', mat_file, accuracy);
+                        end
+                    end
+                end
+            end
+        end 
+    end
+end
+
+fclose(fileID);
diff --git a/normcol_equal.m b/normcol_equal.m
new file mode 100644
index 0000000000000000000000000000000000000000..ada53c74b9952aa1f6a50bb5f33c3688cb28602f
--- /dev/null
+++ b/normcol_equal.m
@@ -0,0 +1,7 @@
+function [matout]=normcol_equal(matin)
+% solve the proximal problem 
+% matout = argmin||matout-matin||_F^2, s.t. matout(:,i)=1
+    matout = matin./repmat(sqrt(sum(matin.*matin,1)+eps),size(matin,1),1);
+end
+
+
diff --git a/omp.m b/omp.m
new file mode 100644
index 0000000000000000000000000000000000000000..eb7cb33b6eb52bd1dc4ad9d88784a228d8b68b22
--- /dev/null
+++ b/omp.m
@@ -0,0 +1,47 @@
+% Copyright (c) 2016 Paul Irofti <paul@irofti.net>
+% 
+% Permission to use, copy, modify, and/or distribute this software for any
+% purpose with or without fee is hereby granted, provided that the above
+% copyright notice and this permission notice appear in all copies.
+% 
+% THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+% WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+% MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+% ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+% WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+% ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+% OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+
+function [X,shared] = omp(Y,D,s,shared,varargin)
+%% Orthogonal Matching Pursuit algorithm
+% INPUTS:
+%   Y -- training signals set
+%   D -- current dictionary
+%   s -- sparsity constraint
+%
+% OUTPUTS:
+%   X -- sparse representations
+
+    if isempty(varargin)
+        ompparams = {'checkdict', 'off'};
+        do_sparse = true;
+    else
+        if strcmp(varargin{1}, 'error')
+            do_sparse = false;
+            error = varargin{2};
+            if length(varargin) > 2
+                ompparams = {varargin{3:end}};
+            else
+                ompparams = {'checkdict', 'off'};
+            end
+        else
+            do_sparse = true;
+            ompparams = varargin;
+        end
+    end
+    if do_sparse
+        X = omp_sparse(D'*Y, D'*D, s, ompparams{:});
+    else
+        X = omp_err(D'*Y, sum(Y.*Y), D'*D, error, ompparams{:});
+    end
+end
diff --git a/omp_err.m b/omp_err.m
new file mode 100644
index 0000000000000000000000000000000000000000..941a473e6c2f21ad32cf4b4bb237644b81ba059c
--- /dev/null
+++ b/omp_err.m
@@ -0,0 +1,205 @@
+function gamma = omp_err(varargin)
+%OMP2 Error-constrained Orthogonal Matching Pursuit.
+%  GAMMA = OMP2(D,X,G,EPSILON) solves the optimization problem
+%
+%       min  |GAMMA|_0     s.t.  |X - D*GAMMA|_2 <= EPSILON
+%      gamma
+%
+%  for each of the signals in X, using Batch Orthogonal Matching Pursuit.
+%  Here, D is a dictionary with normalized columns, X is a matrix
+%  containing column signals, EPSILON is the error target for each signal,
+%  and G is the Gramm matrix D'*D. The output GAMMA is a matrix containing
+%  the sparse representations as its columns. 
+%
+%  GAMMA = OMP2(D,X,[],EPSILON) performs the same operation, but without
+%  the matrix G, using OMP-Cholesky. This call produces the same output as
+%  Batch-OMP, but is significantly slower. Using this syntax is only
+%  recommended when available memory is too small to store G.
+%
+%  GAMMA = OMP2(DtX,XtX,G,EPSILON) is the fastest implementation of OMP2,
+%  but also requires the most memory. Here, DtX stores the projections
+%  D'*X, and XtX is a row vector containing the squared norms of the
+%  signals, sum(X.*X). In this case Batch-OMP is used, but without having
+%  to compute D'*X and XtX in advance, which slightly improves runtime.
+%  Note that in general, the call
+%
+%    GAMMA = OMP2(D'*X, sum(X.*X), G, EPSILON);
+%
+%  will be faster than the call
+%
+%    GAMMA = OMP2(D,X,G,EPSILON);
+%
+%  due to optimized matrix multiplications in Matlab. However, when the
+%  entire matrix D'*X cannot be stored in memory, one of the other two
+%  versions can be used. Both compute D'*X for just one signal at a time,
+%  and thus require much less memory.
+%
+%  GAMMA = OMP2(...,PARAM1,VAL1,PARAM2,VAL2,...) specifies additional
+%  parameters for OMP2. Available parameters are:
+%
+%    'gammamode' - Specifies the representation mode for GAMMA. Can be
+%                  either 'full' or 'sparse', corresponding to a full or
+%                  sparse matrix, respectively. By default, GAMMA is
+%                  returned as a sparse matrix.
+%    'maxatoms' -  Limits the number of atoms in the representation of each
+%                  signal. If specified, the number of atoms in each
+%                  representation does not exceed this number, even if the
+%                  error target is not met. Specifying maxatoms<0 implies
+%                  no limit (default).
+%    'messages'  - Specifies whether progress messages should be displayed.
+%                  When positive, this is the number of seconds between
+%                  status prints. When negative, indicates that no messages
+%                  should be displayed (this is the default).
+%    'checkdict' - Specifies whether dictionary normalization should be
+%                  verified. When set to 'on' (default) the dictionary
+%                  atoms are verified to be of unit L2-norm. Setting this
+%                  parameter to 'off' disables verification and accelerates
+%                  function performance. Note that an unnormalized
+%                  dictionary will produce invalid results.
+%    'profile'   - Can be either 'on' or 'off'. When 'on', profiling
+%                  information is displayed at the end of the funciton
+%                  execution.
+%
+%
+%  Summary of OMP2 versions:
+%
+%    version                 |   speed     |   memory
+%  -------------------------------------------------------------
+%   OMP2(DtX,XtX,G,EPSILON)  |  very fast  |  very large
+%   OMP2(D,X,G,EPSILON)      |  fast       |  moderate
+%   OMP2(D,X,[],EPSILON)     |  very slow  |  small
+%  -------------------------------------------------------------
+%
+%
+%  References:
+%  [1] M. Elad, R. Rubinstein, and M. Zibulevsky, "Efficient Implementation
+%      of the K-SVD Algorithm using Batch Orthogonal Matching Pursuit",
+%      Technical Report - CS, Technion, April 2008.
+%
+%  See also OMP.
+
+
+%  Ron Rubinstein
+%  Computer Science Department
+%  Technion, Haifa 32000 Israel
+%  ronrubin@cs
+%
+%  April 2009
+
+
+% default options
+
+sparse_gamma = 1;
+msgdelta = -1;
+maxatoms = -1;
+checkdict = 1;
+profile = 0;
+
+
+% determine number of parameters
+
+paramnum = 1;
+while (paramnum<=nargin && ~ischar(varargin{paramnum}))
+  paramnum = paramnum+1;
+end
+paramnum = paramnum-1;
+
+
+% parse options
+
+for i = paramnum+1:2:length(varargin)
+  paramname = varargin{i};
+  paramval = varargin{i+1};
+
+  switch lower(paramname)
+
+    case 'gammamode'
+      if (strcmpi(paramval,'sparse'))
+        sparse_gamma = 1;
+      elseif (strcmpi(paramval,'full'))
+        sparse_gamma = 0;
+      else
+        error('Invalid GAMMA mode');
+      end
+      
+    case 'maxatoms'
+      maxatoms = paramval;
+      
+    case 'messages'
+      msgdelta = paramval;
+
+    case 'checkdict'
+      if (strcmpi(paramval,'on'))
+        checkdict = 1;
+      elseif (strcmpi(paramval,'off'))
+        checkdict = 0;
+      else
+        error('Invalid checkdict option');
+      end
+
+    case 'profile'
+      if (strcmpi(paramval,'on'))
+        profile = 1;
+      elseif (strcmpi(paramval,'off'))
+        profile = 0;
+      else
+        error('Invalid profile mode');
+      end
+
+    otherwise
+      error(['Unknown option: ' paramname]);
+  end
+  
+end
+
+
+% determine call type
+
+if (paramnum==4)
+  
+  n1 = size(varargin{1},1);
+  n2 = size(varargin{2},1);
+  n3 = size(varargin{3},1);
+  
+  if ( (n1>1 && n2==1) || (n1==1 && n2==1 && n3==1) )  %  DtX,XtX,G,EPSILON
+    
+    DtX = varargin{1};
+    XtX = varargin{2};
+    G = varargin{3};
+    epsilon = varargin{4};
+    D = [];
+    X = [];
+    
+  else  % D,X,G,EPSILON
+    
+    D = varargin{1};
+    X = varargin{2};
+    G = varargin{3};
+    epsilon = varargin{4};
+    DtX = [];
+    XtX = [];
+    
+  end
+  
+else
+  error('Invalid number of parameters');
+end
+
+
+% verify dictionary normalization
+
+if (checkdict)
+  if (isempty(G))
+    atomnorms = sum(D.*D);
+  else
+    atomnorms = diag(G);
+  end
+  if (any(abs(atomnorms-1) > 1e-2))
+    error('Dictionary columns must be normalized to unit length');
+  end
+end
+
+
+% omp
+
+gamma = omp2mex(D,X,DtX,XtX,G,epsilon,sparse_gamma,msgdelta,maxatoms,profile);
diff --git a/omp_ker.m b/omp_ker.m
new file mode 100644
index 0000000000000000000000000000000000000000..2d0c9d118199bbf53887052eb94112572821ddb7
--- /dev/null
+++ b/omp_ker.m
@@ -0,0 +1,40 @@
+function [X, shared] = omp_ker(K, Kz, A, s, shared, varargin)
+% Kernel Orthogonal Matching Pursuit algorithm
+% Can be used inside a DL algorithm or on its own
+% Input:
+%   K       - kernel matrix
+%   Kz      - kernel matrix between training and test signals (in DL, Kz==K)
+%             if empty, then Kz = K
+%   A       - current dictionary
+%   s       - sparsity constraint
+%
+% Output:
+%   X       - sparse representations
+
+
+    if isempty(varargin)
+        ompparams = {'checkdict', 'off'};
+        do_sparse = true;
+    else
+        if strcmp(varargin{1}, 'error')
+            do_sparse = false;
+            error = varargin{2};
+            if length(varargin) > 2
+                ompparams = varargin{3:end};
+            else
+                ompparams = {'checkdict', 'off'};
+            end
+        else
+            do_sparse = true;
+            ompparams = varargin;
+        end
+    end
+    if isempty(Kz)      % when OMP applied on training signals
+      Kz = K;
+    end
+    if do_sparse
+        X = omp_sparse(A'*Kz, A'*K*A, s, ompparams{:});
+    else
+        X = omp_err(A'*Kz, sum(K), A'*K*A, error, ompparams{:});
+    end
+end
diff --git a/omp_mask.m b/omp_mask.m
new file mode 100644
index 0000000000000000000000000000000000000000..9e10660296fab9860dfaaca225d5a8847164b518
--- /dev/null
+++ b/omp_mask.m
@@ -0,0 +1,41 @@
+% Permission to use, copy, modify, and/or distribute this software for any
+% purpose with or without fee is hereby granted, provided that the above
+% copyright notice and this permission notice appear in all copies.
+% 
+% THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+% WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+% MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+% ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+% WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+% ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+% OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+
+function [X, shared] = omp_mask(Y, D, s, shared, varargin)
+
+% Orthogonal Matching Pursuit algorithm for data with missing entries.
+% The missing entries are simply ignored (masked). 
+% Input:
+%   Y     - training signals set
+%   Ymask - varargin{1} = signal mask
+%   D     - current dictionary
+%   s     - sparsity level
+%
+% Output:
+%   X     - sparse representations
+
+[~,N] = size(Y);
+
+Ymask = varargin{1};
+Y = Y .* Ymask;    % apply mask, to be sure
+
+ompparams = {'checkdict', 'off'};
+
+for i = 1:N    % represent signals one by one, since the available entries are different
+  suppy = find(Ymask(:,i));
+  Dsmall = D(suppy,:);
+  normDsmall = sqrt( sum( Dsmall.*Dsmall ) );
+  Dsmall = Dsmall ./ repmat(normDsmall, size(Dsmall,1), 1);
+%  X(:,i) = omp(Y(suppy,i), Dsmall, s, shared, varargin);
+  X(:,i) = omp_sparse(Dsmall'*Y(suppy,i), Dsmall'*Dsmall, s, ompparams{:});
+  X(:,i) = X(:,i) ./ normDsmall';
+end
diff --git a/omp_sparse.m b/omp_sparse.m
new file mode 100644
index 0000000000000000000000000000000000000000..f1b65ebf30fb931140287ff6a7f1ca0d3b58be18
--- /dev/null
+++ b/omp_sparse.m
@@ -0,0 +1,180 @@
+function gamma = omp_sparse(varargin)
+%OMP Sparsity-constrained Orthogonal Matching Pursuit.
+%  GAMMA = OMP(D,X,G,T) solves the optimization problem
+%
+%       min  |X - D*GAMMA|_2     s.t.  |GAMMA|_0 <= T
+%      gamma
+%
+%  for each of the signals in X, using Batch Orthogonal Matching Pursuit.
+%  Here, D is a dictionary with normalized columns, X is a matrix
+%  containing column signals, T is the # of non-zeros in each signal
+%  representation, and G is the Gramm matrix D'*D. The output GAMMA is a
+%  matrix containing the sparse representations as its columns. 
+%
+%  GAMMA = OMP(D,X,[],T) performs the same operation, but without the
+%  matrix G, using OMP-Cholesky. This call produces the same output as
+%  Batch-OMP, but is significantly slower. Using this syntax is only
+%  recommended when available memory is too small to store G.
+%
+%  GAMMA = OMP(DtX,G,T) is the fastest implementation of OMP, but also
+%  requires the most memory. Here, DtX stores the projections D'*X. In this
+%  case Batch-OMP is used, but without having to compute D'*X in advance,
+%  which slightly improves runtime. Note that in general, the call
+%
+%    GAMMA = OMP(D'*X,G,T);
+%
+%  will be faster than the call
+%
+%    GAMMA = OMP(D,X,G,T);
+%
+%  due to optimized matrix multiplications in Matlab. However, when the
+%  entire matrix D'*X cannot be stored in memory, one of the other two
+%  versions can be used. Both compute D'*X for just one signal at a time,
+%  and thus require much less memory.
+%
+%  GAMMA = OMP(...,PARAM1,VAL1,PARAM2,VAL2,...) specifies additional
+%  parameters for OMP. Available parameters are:
+%
+%    'gammamode' - Specifies the representation mode for GAMMA. Can be
+%                  either 'full' or 'sparse', corresponding to a full or
+%                  sparse matrix, respectively. By default, GAMMA is
+%                  returned as a sparse matrix.
+%    'messages'  - Specifies whether progress messages should be displayed.
+%                  When positive, this is the number of seconds between
+%                  status prints. When negative, indicates that no messages
+%                  should be displayed (this is the default).
+%    'checkdict' - Specifies whether dictionary normalization should be
+%                  verified. When set to 'on' (default) the dictionary
+%                  atoms are verified to be of unit L2-norm. Setting this
+%                  parameter to 'off' disables verification and accelerates
+%                  function performance. Note that an unnormalized
+%                  dictionary will produce invalid results.
+%    'profile'   - Can be either 'on' or 'off'. When 'on', profiling
+%                  information is displayed at the end of the funciton
+%                  execution.
+%
+%
+%  Summary of OMP versions:
+%
+%    version      |   speed     |   memory
+%  --------------------------------------------------
+%   OMP(DtX,G,T)  |  very fast  |  very large
+%   OMP(D,X,G,T)  |  fast       |  moderate
+%   OMP(D,X,[],T) |  very slow  |  small
+%  --------------------------------------------------
+%
+%
+%  References:
+%  [1] M. Elad, R. Rubinstein, and M. Zibulevsky, "Efficient Implementation
+%      of the K-SVD Algorithm using Batch Orthogonal Matching Pursuit",
+%      Technical Report - CS, Technion, April 2008.
+%
+%  See also OMP2.
+
+
+%  Ron Rubinstein
+%  Computer Science Department
+%  Technion, Haifa 32000 Israel
+%  ronrubin@cs
+%
+%  April 2009
+
+
+% default options
+
+sparse_gamma = 0;
+msgdelta = -1;
+checkdict = 1;
+profile = 0;
+
+
+% determine number of parameters
+
+paramnum = 1;
+while (paramnum<=nargin && ~ischar(varargin{paramnum}))
+  paramnum = paramnum+1;
+end
+paramnum = paramnum-1;
+
+
+% parse options
+
+for i = paramnum+1:2:length(varargin)
+  paramname = varargin{i};
+  paramval = varargin{i+1};
+
+  switch lower(paramname)
+
+    case 'gammamode'
+      if (strcmpi(paramval,'sparse'))
+        sparse_gamma = 1;
+      elseif (strcmpi(paramval,'full'))
+        sparse_gamma = 0;
+      else
+        error('Invalid GAMMA mode');
+      end
+      
+    case 'messages'
+      msgdelta = paramval;
+
+    case 'checkdict'
+      if (strcmpi(paramval,'on'))
+        checkdict = 1;
+      elseif (strcmpi(paramval,'off'))
+        checkdict = 0;
+      else
+        error('Invalid checkdict option');
+      end
+
+    case 'profile'
+      if (strcmpi(paramval,'on'))
+        profile = 1;
+      elseif (strcmpi(paramval,'off'))
+        profile = 0;
+      else
+        error('Invalid profile mode');
+      end
+
+    otherwise
+      error(['Unknown option: ' paramname]);
+  end
+  
+end
+
+
+% determine call type
+
+if (paramnum==3)
+  DtX = varargin{1};
+  G = varargin{2};
+  T = varargin{3};
+  D = [];
+  X = [];
+elseif (paramnum==4)
+  D = varargin{1};
+  X = varargin{2};
+  G = varargin{3};
+  T = varargin{4};
+  DtX = [];
+else
+  error('Invalid number of parameters');
+end
+
+
+% verify dictionary normalization
+
+if (checkdict)
+  if (isempty(G))
+    atomnorms = sum(D.*D);
+  else
+    atomnorms = diag(G);
+  end
+  if (any(abs(atomnorms-1) > 1e-2))
+    error('Dictionary columns must be normalized to unit length');
+  end
+end
+
+
+% omp
+
+gamma = ompmex(D,X,DtX,G,T,sparse_gamma,msgdelta,profile);
diff --git a/private/make.m b/private/make.m
new file mode 100755
index 0000000000000000000000000000000000000000..aae72b76c0bcf3127b9c79bb141f39d9efa49007
--- /dev/null
+++ b/private/make.m
@@ -0,0 +1,40 @@
+function make
+%MAKE Build the OMPBox package.
+%  MAKE compiles all OMPBox MEX functions, using Matlab's default MEX
+%  compiler. If the MEX compiler has not been set-up before, please run
+%
+%    mex -setup
+%
+%  before using this MAKE file.
+
+%  Ron Rubinstein
+%  Computer Science Department
+%  Technion, Haifa 32000 Israel
+%  ronrubin@cs
+%
+%  August 2009
+
+
+% detect platform 
+
+compstr = computer;
+is64bit = strcmp(compstr(end-1:end),'64');
+
+
+% compilation parameters
+
+compile_params = cell(0);
+if (is64bit)
+  compile_params{1} = '-largeArrayDims';
+end
+
+% Compile files %
+
+ompsources = {'mexutils.c','ompcore.c','omputils.c','myblas.c','ompprof.c'};
+
+disp('Compiling ompmex...');
+mex('ompmex.c', ompsources{:},compile_params{:});
+
+disp('Compiling omp2mex...');
+mex('omp2mex.c',ompsources{:},compile_params{:});
+
diff --git a/private/mexutils.c b/private/mexutils.c
new file mode 100755
index 0000000000000000000000000000000000000000..169f52392450c3a8ff9a0b25cd27a5d62221f57d
--- /dev/null
+++ b/private/mexutils.c
@@ -0,0 +1,79 @@
+/**************************************************************************
+ *
+ * File name: mexutils.c
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Last Updated: 15.8.2009
+ *
+ *************************************************************************/
+
+#include "mexutils.h"
+#include <math.h>
+
+
+
+/* verify that the mxArray contains a double matrix */
+
+void checkmatrix(const mxArray *param, char *fname, char *pname)
+{
+  char errmsg[100];
+  sprintf(errmsg, "%.15s requires that %.25s be a double matrix.", fname, pname);
+  if (!mxIsDouble(param) || mxIsComplex(param) || mxGetNumberOfDimensions(param)>2) {
+    mexErrMsgTxt(errmsg);
+  }
+}
+
+
+/* verify that the mxArray contains a 1-D double vector */
+
+void checkvector(const mxArray *param, char *fname, char *pname)
+{
+  char errmsg[100];
+  sprintf(errmsg, "%.15s requires that %.25s be a double vector.", fname, pname);
+  if (!mxIsDouble(param) || mxIsComplex(param) || mxGetNumberOfDimensions(param)>2 || (mxGetM(param)!=1 && mxGetN(param)!=1)) {
+    mexErrMsgTxt(errmsg);
+  }
+}
+
+
+/* verify that the mxArray contains a double scalar */
+
+void checkscalar(const mxArray *param, char *fname, char *pname)
+{
+  char errmsg[100];
+  sprintf(errmsg, "%.15s requires that %.25s be a double scalar.", fname, pname);
+  if (!mxIsDouble(param) || mxIsComplex(param) || mxGetNumberOfDimensions(param)>2 || 
+      mxGetM(param)!=1 || mxGetN(param)!=1) 
+  {
+    mexErrMsgTxt(errmsg);
+  }
+}
+
+
+/* verify that the mxArray contains a sparse matrix */
+
+void checksparse(const mxArray *param, char *fname, char *pname)
+{
+  char errmsg[100];
+  sprintf(errmsg, "%.15s requires that %.25s be sparse.", fname, pname);
+  if (!mxIsSparse(param)) {
+    mexErrMsgTxt(errmsg);
+  }
+}
+
+
+/* verify that the mxArray contains a 1-dimensional cell array */
+
+void checkcell_1d(const mxArray *param, char *fname, char *pname)
+{
+  char errmsg[100];
+  sprintf(errmsg, "%.15s requires that %.25s be a 1-D cell array.", fname, pname);
+  if (!mxIsCell(param) || mxGetNumberOfDimensions(param)>2 || (mxGetM(param)!=1 && mxGetN(param)!=1)) {
+    mexErrMsgTxt(errmsg);
+  }
+}
+
diff --git a/private/mexutils.h b/private/mexutils.h
new file mode 100755
index 0000000000000000000000000000000000000000..c800cbd23f5fa477859894edb24a25e635abe9e0
--- /dev/null
+++ b/private/mexutils.h
@@ -0,0 +1,103 @@
+/**************************************************************************
+ *
+ * File name: mexutils.h
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Last Updated: 18.8.2009
+ *
+ * Utility functions for MEX files.
+ *
+ *************************************************************************/
+
+
+#ifndef __MEX_UTILS_H__
+#define __MEX_UTILS_H__
+
+#include "mex.h"
+
+
+
+/**************************************************************************
+ * Function checkmatrix:
+ *
+ * Verify that the specified mxArray is real, of type double, and has 
+ * no more than two dimensions. If not, an error message is printed
+ * and the mex file terminates.
+ * 
+ * Parameters:
+ *   param - the mxArray to be checked
+ *   fname - the name of the function where the error occured (15 characters or less)
+ *   pname - the name of the parameter (25 characters or less)
+ *
+ **************************************************************************/
+void checkmatrix(const mxArray *param, char *fname, char *pname);
+
+
+/**************************************************************************
+ * Function checkvector:
+ *
+ * Verify that the specified mxArray is 1-D, real, and of type double. The
+ * vector may be a column or row vector. Otherwise, an error message is
+ * printed and the mex file terminates.
+ * 
+ * Parameters:
+ *   param - the mxArray to be checked
+ *   fname - the name of the function where the error occured (15 characters or less)
+ *   pname - the name of the parameter (25 characters or less)
+ *
+ **************************************************************************/
+void checkvector(const mxArray *param, char *fname, char *pname);
+
+
+/**************************************************************************
+ * Function checkscalar:
+ *
+ * Verify that the specified mxArray represents a real double scalar value. 
+ * If not, an error message is printed and the mex file terminates.
+ * 
+ * Parameters:
+ *   param - the mxArray to be checked
+ *   fname - the name of the function where the error occured (15 characters or less)
+ *   pname - the name of the parameter (25 characters or less)
+ *
+ **************************************************************************/
+void checkscalar(const mxArray *param, char *fname, char *pname);
+
+
+/**************************************************************************
+ * Function checksparse:
+ *
+ * Verify that the specified mxArray contains a sparse matrix. If not,
+ * an error message is printed and the mex file terminates.
+ * 
+ * Parameters:
+ *   param - the mxArray to be checked
+ *   fname - the name of the function where the error occured (15 characters or less)
+ *   pname - the name of the parameter (25 characters or less)
+ *
+ **************************************************************************/
+void checksparse(const mxArray *param, char *fname, char *pname);
+
+
+/**************************************************************************
+ * Function checkcell_1d:
+ *
+ * Verify that the specified mxArray is a 1-D cell array. The cell array 
+ * may be arranged as either a column or a row. If not, an error message 
+ * is printed and the mex file terminates.
+ * 
+ * Parameters:
+ *   param - the mxArray to be checked
+ *   fname - the name of the function where the error occured (15 characters or less)
+ *   pname - the name of the parameter (25 characters or less)
+ *
+ **************************************************************************/
+void checkcell_1d(const mxArray *param, char *fname, char *pname);
+
+
+#endif
+
diff --git a/private/mexutils.mexw64.manifest b/private/mexutils.mexw64.manifest
new file mode 100755
index 0000000000000000000000000000000000000000..1c06b6190a8d22bbcdd04212af9d1fce14d5f465
--- /dev/null
+++ b/private/mexutils.mexw64.manifest
@@ -0,0 +1,10 @@
+<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
+<assembly xmlns='urn:schemas-microsoft-com:asm.v1' manifestVersion='1.0'>
+  <trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
+    <security>
+      <requestedPrivileges>
+        <requestedExecutionLevel level='asInvoker' uiAccess='false' />
+      </requestedPrivileges>
+    </security>
+  </trustInfo>
+</assembly>
diff --git a/private/mexutils.mexw64.map b/private/mexutils.mexw64.map
new file mode 100755
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/private/myblas.c b/private/myblas.c
new file mode 100755
index 0000000000000000000000000000000000000000..06791b0b45479951451a1ef7df850ae7bbc4f7f5
--- /dev/null
+++ b/private/myblas.c
@@ -0,0 +1,663 @@
+/**************************************************************************
+ *
+ * File name: myblas.c
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Version: 1.1
+ * Last updated: 13.8.2009
+ *
+ *************************************************************************/
+
+
+#include "myblas.h"
+#include <ctype.h>
+
+
+/* find maximum of absolute values */
+
+mwIndex maxabs(double c[], mwSize m)
+{
+  mwIndex maxid=0, k;
+  double absval, maxval = SQR(*c);   /* use square which is quicker than absolute value */
+
+  for (k=1; k<m; ++k) {
+    absval = SQR(c[k]);
+    if (absval > maxval) {
+      maxval = absval;
+      maxid = k;
+    }
+  }
+  return maxid;
+}
+
+
+/* compute y := alpha*x + y */
+
+void vec_sum(double alpha, double x[], double y[], mwSize n)
+{
+  mwIndex i;
+
+  for (i=0; i<n; ++i) {
+    y[i] += alpha*x[i];
+  }
+}
+
+
+/* compute y := alpha*A*x */
+
+void mat_vec(double alpha, double A[], double x[], double y[], mwSize n, mwSize m)
+{
+  mwIndex i, j, i_n;
+  double *Ax;
+
+  Ax = mxCalloc(n,sizeof(double));
+
+  for (i=0; i<m; ++i) {
+    i_n = i*n;
+    for (j=0; j<n; ++j) {
+      Ax[j] += A[i_n+j] * x[i];
+    }
+  }
+
+  for (j=0; j<n; ++j) {
+    y[j] = alpha*Ax[j];
+  }
+
+  mxFree(Ax);
+}
+
+
+/* compute y := alpha*A'*x */
+
+void matT_vec(double alpha, double A[], double x[], double y[], mwSize n, mwSize m)
+{
+  mwIndex i, j, n_i;
+  double sum0, sum1, sum2, sum3;
+
+  for (j=0; j<m; ++j) {
+    y[j] = 0;
+  }
+
+  /* use loop unrolling to accelerate computation */
+
+  for (i=0; i<m; ++i) {
+    n_i = n*i;
+    sum0 = sum1 = sum2 = sum3 = 0;
+    for (j=0; j+4<n; j+=4) {
+      sum0 += A[n_i+j]*x[j];
+      sum1 += A[n_i+j+1]*x[j+1];
+      sum2 += A[n_i+j+2]*x[j+2];
+      sum3 += A[n_i+j+3]*x[j+3];
+    }
+    y[i] += alpha * ((sum0 + sum1) + (sum2 + sum3));
+    while (j<n) {
+      y[i] += alpha*A[n_i+j]*x[j];
+      j++;
+    }
+  }
+}
+
+
+/* compute y := alpha*A*x */
+
+void mat_sp_vec(double alpha, double pr[], mwIndex ir[], mwIndex jc[], double x[], double y[], mwSize n, mwSize m)
+{
+  
+  mwIndex i, j, j1, j2;
+
+  for (i=0; i<n; ++i) {
+    y[i] = 0;
+  }
+  
+  j2 = jc[0];
+  for (i=0; i<m; ++i) {
+    j1 = j2; j2 = jc[i+1];
+    for (j=j1; j<j2; ++j) {
+      y[ir[j]] += alpha * pr[j] * x[i];
+    }
+  }
+  
+}
+
+
+/* compute y := alpha*A'*x */
+
+void matT_sp_vec(double alpha, double pr[], mwIndex ir[], mwIndex jc[], double x[], double y[], mwSize n, mwSize m)
+{
+  
+  mwIndex i, j, j1, j2;
+  
+  for (i=0; i<m; ++i) {
+    y[i] = 0;
+  }
+  
+  j2 = jc[0];
+  for (i=0; i<m; ++i) {
+    j1 = j2; j2 = jc[i+1];
+    for (j=j1; j<j2; ++j) {
+      y[i] += alpha * pr[j] * x[ir[j]];
+    }
+  }
+  
+}
+
+
+/* compute y := alpha*A*x */
+
+void mat_vec_sp(double alpha, double A[], double pr[], mwIndex ir[], mwIndex jc[], double y[], mwSize n, mwSize m)
+{
+  
+  mwIndex i, j, j_n, k, kend;
+  
+  for (i=0; i<n; ++i) {
+    y[i] = 0;
+  }
+  
+  kend = jc[1];
+  if (kend==0) {   /* x is empty */
+    return;
+  }
+  
+  for (k=0; k<kend; ++k) {
+    j = ir[k];
+    j_n = j*n;
+    for (i=0; i<n; ++i) {
+      y[i] += alpha * A[i+j_n] * pr[k];
+    }
+  }
+
+}
+
+
+/* compute y := alpha*A'*x */
+
+void matT_vec_sp(double alpha, double A[], double pr[], mwIndex ir[], mwIndex jc[], double y[], mwSize n, mwSize m)
+{
+  
+  mwIndex i, j, j_n, k, kend;
+  
+  for (i=0; i<m; ++i) {
+    y[i] = 0;
+  }
+  
+  kend = jc[1];
+  if (kend==0) {   /* x is empty */
+    return;
+  }
+  
+  for (j=0; j<m; ++j) {
+    j_n = j*n;
+    for (k=0; k<kend; ++k) {
+      i = ir[k];
+      y[j] += alpha * A[i+j_n] * pr[k];
+    }
+  }
+  
+}
+
+
+/* compute y := alpha*A*x */
+
+void mat_sp_vec_sp(double alpha, double pr[], mwIndex ir[], mwIndex jc[], double prx[], mwIndex irx[], mwIndex jcx[], double y[], mwSize n, mwSize m)
+{
+  
+  mwIndex i, j, k, kend, j1, j2;
+
+  for (i=0; i<n; ++i) {
+    y[i] = 0;
+  }
+  
+  kend = jcx[1]; 
+  if (kend==0) {   /* x is empty */
+    return;
+  }
+  
+  for (k=0; k<kend; ++k) {
+    i = irx[k];
+    j1 = jc[i]; j2 = jc[i+1];
+    for (j=j1; j<j2; ++j) {
+      y[ir[j]] += alpha * pr[j] * prx[k];
+    }
+  }
+  
+}
+
+
+/* compute y := alpha*A'*x */
+
+void matT_sp_vec_sp(double alpha, double pr[], mwIndex ir[], mwIndex jc[], double prx[], mwIndex irx[], mwIndex jcx[], double y[], mwSize n, mwSize m)
+{
+  
+  mwIndex i, j, k, jend, kend, jadd, kadd, delta;
+  
+  for (i=0; i<m; ++i) {
+    y[i] = 0;
+  }
+  
+  kend = jcx[1];
+  if (kend==0) {   /* x is empty */
+    return;
+  }
+  
+  for (i=0; i<m; ++i) {
+    j = jc[i]; 
+    jend = jc[i+1];
+    k = 0;
+    while (j<jend && k<kend) {
+      
+      delta = ir[j] - irx[k];
+      
+      if (delta) { /* if indices differ - increment the smaller one */
+        jadd = delta<0;
+        kadd = 1-jadd;
+        j += jadd;
+        k += kadd;
+      }
+      
+      else {    /* indices are equal - add to result and increment both */
+        y[i] += alpha * pr[j] * prx[k];
+        j++; k++;
+      }
+    }
+  }
+  
+}
+
+
+/* matrix-matrix multiplication */
+
+void mat_mat(double alpha, double A[], double B[], double X[], mwSize n, mwSize m, mwSize k)
+{
+  mwIndex i1, i2, i3, iX, iA, i2_n;
+  double b;
+  
+  for (i1=0; i1<n*k; i1++) {
+    X[i1] = 0;
+  }
+
+  for (i2=0; i2<m; ++i2) {
+    i2_n = i2*n;
+    iX = 0;
+    for (i3=0; i3<k; ++i3) {
+      iA = i2_n;
+      b = B[i2+i3*m];
+      for (i1=0; i1<n; ++i1) {
+        X[iX++] += A[iA++]*b;
+      }
+    }
+  }
+  
+  for (i1=0; i1<n*k; i1++) {
+    X[i1] *= alpha;
+  }
+}
+
+
+/* matrix-transpose-matrix multiplication */
+
+void matT_mat(double alpha, double A[], double B[], double X[], mwSize n, mwSize m, mwSize k)
+{
+  mwIndex i1, i2, i3, iX, iA, i2_n;
+  double *x, sum0, sum1, sum2, sum3;
+
+  for (i2=0; i2<m; ++i2) {
+    for (i3=0; i3<k; ++i3) {
+      sum0 = sum1 = sum2 = sum3 = 0;
+      for (i1=0; i1+4<n; i1+=4) {
+        sum0 += A[i1+0+i2*n]*B[i1+0+i3*n];
+        sum1 += A[i1+1+i2*n]*B[i1+1+i3*n];
+        sum2 += A[i1+2+i2*n]*B[i1+2+i3*n];
+        sum3 += A[i1+3+i2*n]*B[i1+3+i3*n];
+      }
+      X[i2+i3*m] = (sum0+sum1) + (sum2+sum3);
+      while(i1<n) {
+        X[i2+i3*m] += A[i1+i2*n]*B[i1+i3*n];
+        i1++;
+      }
+    }
+  }
+  
+  for (i1=0; i1<m*k; i1++) {
+    X[i1] *= alpha;
+  }
+}
+
+
+/* tensor-matrix product */
+
+void tens_mat(double alpha, double A[], double B[], double X[], mwSize n, mwSize m, mwSize k, mwSize l)
+{
+  mwIndex i1, i2, i3, i4, i2_n, nml;
+  double b;
+  
+  nml = n*m*l;
+  for (i1=0; i1<nml; ++i1) {
+    X[i1] = 0;
+  }
+
+  for (i2=0; i2<m; ++i2) {
+    i2_n = i2*n;
+    for (i3=0; i3<k; ++i3) {
+      for (i4=0; i4<l; ++i4) {
+        b = B[i4+i3*l];
+        for (i1=0; i1<n; ++i1) {
+          X[i1 + i2_n + i4*n*m] += A[i1 + i2_n + i3*n*m] * b;
+        }
+      }
+    }
+  }
+  
+  for (i1=0; i1<nml; ++i1) {
+    X[i1] *= alpha;
+  }
+}
+
+
+/* tensor-matrix-transpose product */
+
+void tens_matT(double alpha, double A[], double B[], double X[], mwSize n, mwSize m, mwSize k, mwSize l)
+{
+  mwIndex i1, i2, i3, i4, i2_n, nml;
+  double b;
+  
+  nml = n*m*l;
+  for (i1=0; i1<nml; ++i1) {
+    X[i1] = 0;
+  }
+
+  for (i2=0; i2<m; ++i2) {
+    i2_n = i2*n;
+    for (i4=0; i4<l; ++i4) {
+      for (i3=0; i3<k; ++i3) {
+        b = B[i3+i4*k];
+        for (i1=0; i1<n; ++i1) {
+          X[i1 + i2_n + i4*n*m] += A[i1 + i2_n + i3*n*m] * b;
+        }
+      }
+    }
+  }
+  
+  for (i1=0; i1<nml; ++i1) {
+    X[i1] *= alpha;
+  }
+}
+
+
+/* dot product */
+
+double dotprod(double a[], double b[], mwSize n)
+{
+  double sum = 0;
+  mwIndex i;
+  for (i=0; i<n; ++i)
+    sum += a[i]*b[i];
+  return sum;
+}
+
+
+/* find maximum of vector */
+
+mwIndex maxpos(double c[], mwSize m)
+{
+  mwIndex maxid=0, k;
+  double val, maxval = *c;
+
+  for (k=1; k<m; ++k) {
+    val = c[k];
+    if (val > maxval) {
+      maxval = val;
+      maxid = k;
+    }
+  }
+  return maxid;
+}
+
+
+/* solve L*x = b */
+
+void backsubst_L(double L[], double b[], double x[], mwSize n, mwSize k)
+{
+  mwIndex i, j;
+  double rhs;
+
+  for (i=0; i<k; ++i) {
+    rhs = b[i];
+    for (j=0; j<i; ++j) {
+      rhs -= L[j*n+i]*x[j];
+    }
+    x[i] = rhs/L[i*n+i];
+  }
+}
+
+
+/* solve L'*x = b */
+
+void backsubst_Lt(double L[], double b[], double x[], mwSize n, mwSize k)
+{
+  mwIndex i, j;
+  double rhs;
+
+  for (i=k; i>=1; --i) {
+    rhs = b[i-1];
+    for (j=i; j<k; ++j) {
+      rhs -= L[(i-1)*n+j]*x[j];
+    }
+    x[i-1] = rhs/L[(i-1)*n+i-1];
+  }
+}
+
+
+/* solve U*x = b */
+
+void backsubst_U(double U[], double b[], double x[], mwSize n, mwSize k)
+{
+  mwIndex i, j;
+  double rhs;
+
+  for (i=k; i>=1; --i) {
+    rhs = b[i-1];
+    for (j=i; j<k; ++j) {
+      rhs -= U[j*n+i-1]*x[j];
+    }
+    x[i-1] = rhs/U[(i-1)*n+i-1];
+  }
+}
+
+
+/* solve U'*x = b */
+
+void backsubst_Ut(double U[], double b[], double x[], mwSize n, mwSize k)
+{
+  mwIndex i, j;
+  double rhs;
+
+  for (i=0; i<k; ++i) {
+    rhs = b[i];
+    for (j=0; j<i; ++j) {
+      rhs -= U[i*n+j]*x[j];
+    }
+    x[i] = rhs/U[i*n+i];
+  }
+}
+
+
+/* back substitution solver */
+
+void backsubst(char ul, double A[], double b[], double x[], mwSize n, mwSize k)
+{
+  if (tolower(ul) == 'u') {
+    backsubst_U(A, b, x, n, k);
+  }
+  else if (tolower(ul) == 'l') {
+    backsubst_L(A, b, x, n, k);
+  }
+  else {
+    mexErrMsgTxt("Invalid triangular matrix type: must be ''U'' or ''L''");
+  }
+}
+
+
+/* solve equation set using cholesky decomposition */
+
+void cholsolve(char ul, double A[], double b[], double x[], mwSize n, mwSize k)
+{
+  double *tmp;
+
+  tmp = mxMalloc(k*sizeof(double));
+
+  if (tolower(ul) == 'l') {
+    backsubst_L(A, b, tmp, n, k);
+    backsubst_Lt(A, tmp, x, n, k);
+  }
+  else if (tolower(ul) == 'u') {
+    backsubst_Ut(A, b, tmp, n, k);
+    backsubst_U(A, tmp, x, n, k);
+  }
+  else {
+    mexErrMsgTxt("Invalid triangular matrix type: must be either ''U'' or ''L''");
+  }
+
+  mxFree(tmp);
+}
+
+
+/* perform a permutation assignment y := x(ind(1:k)) */
+
+void vec_assign(double y[], double x[], mwIndex ind[], mwSize k)
+{
+  mwIndex i;
+
+  for (i=0; i<k; ++i)
+    y[i] = x[ind[i]];
+}
+
+
+/* matrix transpose */
+
+void transpose(double X[], double Y[], mwSize n, mwSize m)
+{
+  mwIndex i, j, i_m, j_n;
+  
+  if (n<m) {
+    for (j=0; j<m; ++j) {
+      j_n = j*n;
+      for (i=0; i<n; ++i) {
+        Y[j+i*m] = X[i+j_n];
+      }
+    }
+  }
+  else {
+    for (i=0; i<n; ++i) {
+      i_m = i*m;
+      for (j=0; j<m; ++j) {
+        Y[j+i_m] = X[i+j*n];
+      }
+    }
+  }
+}
+
+
+/* print contents of matrix */
+
+void printmat(double A[], int n, int m, char* matname)
+{
+  int i, j;
+  mexPrintf("\n%s = \n\n", matname);
+
+  if (n*m==0) {
+    mexPrintf("   Empty matrix: %d-by-%d\n\n", n, m);
+    return;
+  }
+
+  for (i=0; i<n; ++i) {
+    for (j=0; j<m; ++j)
+      mexPrintf("   %lf", A[j*n+i]);
+    mexPrintf("\n");
+  }
+  mexPrintf("\n");
+}
+
+
+/* print contents of sparse matrix */
+
+void printspmat(mxArray *a, char* matname)
+{
+  mwIndex *aJc = mxGetJc(a);
+  mwIndex *aIr = mxGetIr(a);
+  double *aPr = mxGetPr(a);
+
+  int i;
+
+  mexPrintf("\n%s = \n\n", matname);
+
+  for (i=0; i<aJc[1]; ++i)
+    printf("   (%d,1) = %lf\n", aIr[i]+1,aPr[i]);
+
+  mexPrintf("\n");
+}
+
+
+
+/* matrix multiplication using Winograd's algorithm */
+
+/*
+void mat_mat2(double alpha, double A[], double B[], double X[], mwSize n, mwSize m, mwSize k)
+{
+  
+  mwIndex i1, i2, i3, iX, iA, i2_n;
+  double b, *AA, *BB;
+  
+  AA = mxCalloc(n,sizeof(double));
+  BB = mxCalloc(k,sizeof(double));
+  
+  for (i1=0; i1<n*k; i1++) {
+    X[i1] = 0;
+  }
+  
+  for (i1=0; i1<n; ++i1) {
+    for (i2=0; i2<m/2; ++i2) {
+      AA[i1] += A[i1+2*i2*n]*A[i1+(2*i2+1)*n];
+    }
+  }
+
+  for (i2=0; i2<k; ++i2) {
+    for (i1=0; i1<m/2; ++i1) {
+      BB[i2] += B[2*i1+i2*m]*B[2*i1+1+i2*m];
+    }
+  }
+
+  for (i2=0; i2<k; ++i2) {
+    for (i3=0; i3<m/2; ++i3) {
+      for (i1=0; i1<n; ++i1) {
+        X[i1+i2*n] += (A[i1+(2*i3)*n]+B[2*i3+1+i2*m])*(A[i1+(2*i3+1)*n]+B[2*i3+i2*m]);
+      }
+    }
+  }
+  
+  if (m%2) {
+    for (i2=0; i2<k; ++i2) {
+      for (i1=0; i1<n; ++i1) {
+        X[i1+i2*n] += A[i1+(m-1)*n]*B[m-1+i2*m];
+      }
+    }
+  }
+  
+  for (i2=0; i2<k; ++i2) {
+    for (i1=0; i1<n; ++i1) {
+      X[i1+i2*n] -= (AA[i1] + BB[i2]);
+      X[i1+i2*n] *= alpha;
+    }
+  }
+  
+  mxFree(AA);
+  mxFree(BB);
+}
+*/
+
+
+
+
diff --git a/private/myblas.h b/private/myblas.h
new file mode 100755
index 0000000000000000000000000000000000000000..17df5ece0b17d154ef68394112ba5d9dc8517279
--- /dev/null
+++ b/private/myblas.h
@@ -0,0 +1,492 @@
+/**************************************************************************
+ *
+ * File name: myblas.h
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Version: 1.1
+ * Last updated: 17.8.2009
+ *
+ * A collection of basic linear algebra functions, in the spirit of the
+ * BLAS/LAPACK libraries.
+ *
+ *************************************************************************/
+
+
+
+#ifndef __MY_BLAS_H__
+#define __MY_BLAS_H__
+
+
+#include "mex.h"
+#include <math.h>
+
+
+
+/**************************************************************************
+ * Squared value.
+ **************************************************************************/
+#define SQR(X) ((X)*(X))
+
+
+
+/**************************************************************************
+ * Matrix-vector multiplication. 
+ *
+ * Computes an operation of the form:
+ *
+ *   y := alpha*A*x
+ *
+ * Parameters:
+ *   A - matrix of size n X m
+ *   x - vector of length m
+ *   y - output vector of length n
+ *   alpha - real constant
+ *   n, m - dimensions of A
+ *
+ * Note: This function re-writes the contents of y.
+ *
+ **************************************************************************/
+void mat_vec(double alpha, double A[], double x[], double y[], mwSize n, mwSize m);
+
+
+
+/**************************************************************************
+ * Matrix-transpose-vector multiplication. 
+ *
+ * Computes an operation of the form:
+ *
+ *   y := alpha*A'*x
+ *
+ * Parameters:
+ *   A - matrix of size n X m
+ *   x - vector of length n
+ *   y - output vector of length m
+ *   alpha - real constant
+ *   n, m - dimensions of A
+ *
+ * Note: This function re-writes the contents of y.
+ *
+ **************************************************************************/
+void matT_vec(double alpha, double A[], double x[], double y[], mwSize n, mwSize m);
+
+
+
+/**************************************************************************
+ * Sparse-matrix-vector multiplication. 
+ *
+ * Computes an operation of the form:
+ *
+ *   y := alpha*A*x
+ *
+ * where A is a sparse matrix.
+ *
+ * Parameters:
+ *   pr,ir,jc - sparse representation of the matrix A, of size n x m
+ *   x - vector of length m
+ *   y - output vector of length n
+ *   alpha - real constant
+ *   n, m - dimensions of A
+ *
+ * Note: This function re-writes the contents of y.
+ *
+ **************************************************************************/
+void mat_sp_vec(double alpha, double pr[], mwIndex ir[], mwIndex jc[], double x[], double y[], mwSize n, mwSize m);
+
+
+
+/**************************************************************************
+ * Sparse-matrix-transpose-vector multiplication. 
+ *
+ * Computes an operation of the form:
+ *
+ *   y := alpha*A'*x
+ *
+ * where A is a sparse matrix.
+ *
+ * Parameters:
+ *   pr,ir,jc - sparse representation of the matrix A, of size n x m
+ *   x - vector of length m
+ *   y - output vector of length n
+ *   alpha - real constant
+ *   n, m - dimensions of A
+ *
+ * Note: This function re-writes the contents of y.
+ *
+ **************************************************************************/
+void matT_sp_vec(double alpha, double pr[], mwIndex ir[], mwIndex jc[], double x[], double y[], mwSize n, mwSize m);
+
+
+
+/**************************************************************************
+ * Matrix-sparse-vector multiplication. 
+ *
+ * Computes an operation of the form:
+ *
+ *   y := alpha*A*x
+ *
+ * where A is a matrix and x is a sparse vector.
+ *
+ * Parameters:
+ *   A - matrix of size n X m
+ *   pr,ir,jc - sparse representation of the vector x, of length m
+ *   y - output vector of length n
+ *   alpha - real constant
+ *   n, m - dimensions of A
+ *
+ * Note: This function re-writes the contents of y.
+ *
+ **************************************************************************/
+void mat_vec_sp(double alpha, double A[], double pr[], mwIndex ir[], mwIndex jc[], double y[], mwSize n, mwSize m);
+
+
+
+/**************************************************************************
+ * Matrix-transpose-sparse-vector multiplication. 
+ *
+ * Computes an operation of the form:
+ *
+ *   y := alpha*A'*x
+ *
+ * where A is a matrix and x is a sparse vector.
+ *
+ * Parameters:
+ *   A - matrix of size n X m
+ *   pr,ir,jc - sparse representation of the vector x, of length n
+ *   y - output vector of length m
+ *   alpha - real constant
+ *   n, m - dimensions of A
+ *
+ * Note: This function re-writes the contents of y.
+ *
+ **************************************************************************/
+void matT_vec_sp(double alpha, double A[], double pr[], mwIndex ir[], mwIndex jc[], double y[], mwSize n, mwSize m);
+
+
+
+/**************************************************************************
+ * Sparse-matrix-sparse-vector multiplication. 
+ *
+ * Computes an operation of the form:
+ *
+ *   y := alpha*A*x
+ *
+ * where A is a sparse matrix and x is a sparse vector.
+ *
+ * Parameters:
+ *   pr,ir,jc - sparse representation of the matrix A, of size n x m
+ *   prx,irx,jcx - sparse representation of the vector x (of length m)
+ *   y - output vector of length n
+ *   alpha - real constant
+ *   n, m - dimensions of A
+ *
+ * Note: This function re-writes the contents of y.
+ *
+ **************************************************************************/
+void mat_sp_vec_sp(double alpha, double pr[], mwIndex ir[], mwIndex jc[], double prx[], mwIndex irx[], mwIndex jcx[], double y[], mwSize n, mwSize m);
+
+
+
+/**************************************************************************
+ * Sparse-matrix-transpose-sparse-vector multiplication. 
+ *
+ * Computes an operation of the form:
+ *
+ *   y := alpha*A'*x
+ *
+ * where A is a sparse matrix and x is a sparse vector.
+ *
+ * Importnant note: this function is provided for completeness, but is NOT efficient.
+ * If possible, convert x to non-sparse representation and use matT_vec_sp instead.
+ *
+ * Parameters:
+ *   pr,ir,jc - sparse representation of the matrix A, of size n x m
+ *   prx,irx,jcx - sparse representation of the vector x (of length n)
+ *   y - output vector of length n
+ *   alpha - real constant
+ *   n, m - dimensions of A
+ *
+ * Note: This function re-writes the contents of y.
+ *
+ **************************************************************************/
+void matT_sp_vec_sp(double alpha, double pr[], mwIndex ir[], mwIndex jc[], double prx[], mwIndex irx[], mwIndex jcx[], double y[], mwSize n, mwSize m);
+
+
+
+/**************************************************************************
+ * Matrix-matrix multiplication. 
+ *
+ * Computes an operation of the form:
+ *
+ *   X := alpha*A*B
+ *
+ * Parameters:
+ *   A - matrix of size n X m
+ *   B - matrix of size m X k
+ *   X - output matrix of size n X k
+ *   alpha - real constant
+ *   n, m, k - dimensions of A, B
+ *
+ * Note: This function re-writes the contents of X.
+ *
+ **************************************************************************/
+void mat_mat(double alpha, double A[], double B[], double X[], mwSize n, mwSize m, mwSize k);
+
+
+
+/**************************************************************************
+ * Matrix-transpose-matrix multiplication. 
+ *
+ * Computes an operation of the form:
+ *
+ *   X := alpha*A*B
+ *
+ * Parameters:
+ *   A - matrix of size n X m
+ *   B - matrix of size m X k
+ *   X - output matrix of size n X k
+ *   alpha - real constant
+ *   n, m, k - dimensions of A, B
+ *
+ * Note: This function re-writes the contents of X.
+ *
+ **************************************************************************/
+void matT_mat(double alpha, double A[], double B[], double X[], mwSize n, mwSize m, mwSize k);
+
+
+
+/**************************************************************************
+ * Tensor-matrix multiplication. 
+ *
+ * This function accepts a 3-D tensor A of size n X m X k
+ * and a 2-D matrix B of size l X k.
+ * The function computes the 3-D tensor X of size n X m X l, where
+ *
+ *   X(i,j,:) = B*A(i,j,:)
+ *
+ * for all i,j.
+ *
+ * Parameters:
+ *   A - tensor of size n X m X k
+ *   B - matrix of size l X k
+ *   X - output tensor of size n X m X l
+ *   alpha - real constant
+ *   n, m, k, l - dimensions of A, B
+ *
+ * Note: This function re-writes the contents of X.
+ *
+ **************************************************************************/
+void tens_mat(double alpha, double A[], double B[], double X[], mwSize n, mwSize m, mwSize k, mwSize l);
+
+
+
+/**************************************************************************
+ * Tensor-matrix-transpose multiplication. 
+ *
+ * This function accepts a 3-D tensor A of size n X m X k
+ * and a 2-D matrix B of size k X l.
+ * The function computes the 3-D tensor X of size n X m X l, where
+ *
+ *   X(i,j,:) = B'*A(i,j,:)
+ *
+ * for all i,j.
+ *
+ * Parameters:
+ *   A - tensor of size n X m X k
+ *   B - matrix of size k X l
+ *   X - output tensor of size n X m X l
+ *   alpha - real constant
+ *   n, m, k, l - dimensions of A, B
+ *
+ * Note: This function re-writes the contents of X.
+ *
+ **************************************************************************/
+void tens_matT(double alpha, double A[], double B[], double X[], mwSize n, mwSize m, mwSize k, mwSize l);
+
+
+
+/**************************************************************************
+ * Vector-vector sum.
+ *
+ * Computes an operation of the form:
+ *
+ *   y := alpha*x + y
+ *
+ * Parameters:
+ *   x - vector of length n
+ *   y - output vector of length n
+ *   alpha - real constant
+ *   n - length of x,y
+ *
+ * Note: This function re-writes the contents of y.
+ *
+ **************************************************************************/
+void vec_sum(double alpha, double x[], double y[], mwSize n);
+
+
+
+/**************************************************************************
+ * Triangular back substitution.
+ *
+ * Solve the set of linear equations
+ *
+ *   T*x = b
+ *
+ * where T is lower or upper triangular.
+ *
+ * Parameters:
+ *   ul - 'U' for upper triangular, 'L' for lower triangular
+ *   A  - matrix of size n x m containing T
+ *   b  - vector of length k
+ *   x  - output vector of length k
+ *   n  - size of first dimension of A
+ *   k  - the size of the equation set, k<=n,m
+ *
+ * Note:
+ *   The matrix A can be of any size n X m, as long as n,m >= k. 
+ *   Only the lower/upper triangle of the submatrix A(1:k,1:k) defines the
+ *   matrix T (depending on the parameter ul).
+ *
+ **************************************************************************/
+void backsubst(char ul, double A[], double b[], double x[], mwSize n, mwSize k);
+
+
+
+/**************************************************************************
+ * Solve a set of equations using a Cholesky decomposition.
+ *
+ * Solve the set of linear equations
+ *
+ *   M*x = b
+ *
+ * where M is positive definite with a known Cholesky decomposition:
+ * either M=L*L' (L lower triangular) or M=U'*U (U upper triangular).
+ *
+ * Parameters:
+ *   ul - 'U' for upper triangular, 'L' for lower triangular decomposition
+ *   A  - matrix of size n x m with the Cholesky decomposition of M
+ *   b  - vector of length k
+ *   x  - output vector of length k
+ *   n  - size of first dimension of A
+ *   k  - the size of the equation set, k<=n,m
+ *
+ * Note:
+ *   The matrix A can be of any size n X m, as long as n,m >= k. 
+ *   Only the lower/upper triangle of the submatrix A(1:k,1:k) is used as
+ *   the Cholesky decomposition of M (depending on the parameter ul).
+ *
+ **************************************************************************/
+void cholsolve(char ul, double A[], double b[], double x[], mwSize n, mwSize k);
+
+
+
+/**************************************************************************
+ * Maximum absolute value.
+ *
+ * Returns the index of the coefficient with maximal absolute value in a vector.
+ *
+ * Parameters:
+ *   x - vector of length n
+ *   n - length of x
+ *
+ **************************************************************************/
+mwIndex maxabs(double x[], mwSize n);
+
+
+
+/**************************************************************************
+ * Maximum vector element.
+ *
+ * Returns the index of the maximal coefficient in a vector.
+ *
+ * Parameters:
+ *   x - vector of length n
+ *   n - length of x
+ *
+ **************************************************************************/
+mwIndex maxpos(double x[], mwSize n);
+
+
+
+/**************************************************************************
+ * Vector-vector dot product.
+ *
+ * Computes an operation of the form:
+ *
+ *   c = a'*b
+ *
+ * Parameters:
+ *   a, b - vectors of length n
+ *   n - length of a,b
+ *
+ * Returns: The dot product c.
+ *
+ **************************************************************************/
+double dotprod(double a[], double b[], mwSize n);
+
+
+
+/**************************************************************************
+ * Indexed vector assignment.
+ *
+ * Perform a permutation assignment of the form
+ *
+ *   y = x(ind)
+ *
+ * where ind is an array of indices to x.
+ *
+ * Parameters:
+ *   y - output vector of length k
+ *   x - input vector of arbitrary length
+ *   ind - array of indices into x (indices begin at 0)
+ *   k - length of the array ind
+ *
+ **************************************************************************/
+void vec_assign(double y[], double x[], mwIndex ind[], mwSize k);
+
+
+
+/**************************************************************************
+ * Matrix transpose.
+ *
+ * Computes Y := X'
+ *
+ * Parameters:
+ *   X - input matrix of size n X m
+ *   Y - output matrix of size m X n
+ *   n, m - dimensions of X
+ *
+ **************************************************************************/
+void transpose(double X[], double Y[], mwSize n, mwSize m);
+
+
+
+/**************************************************************************
+ * Print a matrix.
+ *
+ * Parameters:
+ *   A - matrix of size n X m
+ *   n, m - dimensions of A
+ *   matname - name of matrix to display
+ *
+ **************************************************************************/
+void printmat(double A[], int n, int m, char* matname);
+
+
+
+/**************************************************************************
+ * Print a sparse matrix.
+ *
+ * Parameters:
+ *   A - sparse matrix of type double
+ *   matname - name of matrix to display
+ *
+ **************************************************************************/
+void printspmat(mxArray *A, char* matname);
+
+
+#endif
+
diff --git a/private/omp2mex.c b/private/omp2mex.c
new file mode 100755
index 0000000000000000000000000000000000000000..2b16f4684c3cf9e0853ce06330c99f95bed24774
--- /dev/null
+++ b/private/omp2mex.c
@@ -0,0 +1,156 @@
+/**************************************************************************
+ *
+ * File name: omp2mex.c
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Last Updated: 18.8.2009
+ *
+ *************************************************************************/
+
+#include "ompcore.h"
+#include "omputils.h"
+#include "mexutils.h"
+
+
+/* Input Arguments */
+
+#define	IN_D	        prhs[0]
+#define IN_X          prhs[1]
+#define IN_DtX        prhs[2]
+#define IN_XtX        prhs[3]
+#define IN_G          prhs[4]
+#define IN_EPS        prhs[5]
+#define IN_SPARSE_G   prhs[6]
+#define IN_MSGDELTA   prhs[7]
+#define IN_MAXATOMS   prhs[8]
+#define IN_PROFILE    prhs[9]
+
+
+/* Output Arguments */
+
+#define	GAMMA_OUT     plhs[0]
+
+
+/***************************************************************************************/
+
+
+void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray*prhs[])
+
+{
+  double *D, *x, *DtX, *XtX, *G, eps, msgdelta;
+  int gmode, maxatoms, profile;
+  mwSize m, n, L;    /* D is n x m , X is n x L, DtX is m x L */
+
+  
+  /* check parameters */
+  
+  checkmatrix(IN_D, "OMP2", "D");
+  checkmatrix(IN_X, "OMP2", "X");
+  checkmatrix(IN_DtX, "OMP2", "DtX");
+  checkmatrix(IN_XtX, "OMP2", "XtX");
+  checkmatrix(IN_G, "OMP2", "G");
+  
+  checkscalar(IN_EPS, "OMP2", "EPSILON");
+  checkscalar(IN_SPARSE_G, "OMP2", "sparse_g");
+  checkscalar(IN_MSGDELTA, "OMP2", "msgdelta");
+  checkscalar(IN_MAXATOMS, "OMP2", "maxatoms");
+  checkscalar(IN_PROFILE, "OMP2", "profile");
+  
+  
+  /* get parameters */
+  
+  x = D = DtX = XtX = G = 0;
+  
+  if (!mxIsEmpty(IN_D))
+    D = mxGetPr(IN_D);
+  
+  if (!mxIsEmpty(IN_X))
+    x = mxGetPr(IN_X);
+  
+  if (!mxIsEmpty(IN_DtX))
+    DtX = mxGetPr(IN_DtX);
+  
+  if (!mxIsEmpty(IN_XtX))
+    XtX = mxGetPr(IN_XtX);
+  
+  if (!mxIsEmpty(IN_G))
+    G = mxGetPr(IN_G);
+  
+  eps = mxGetScalar(IN_EPS);
+  if ((int)(mxGetScalar(IN_SPARSE_G)+1e-2)) {
+    gmode = SPARSE_GAMMA;
+  }
+  else {
+    gmode = FULL_GAMMA;
+  }
+  msgdelta = mxGetScalar(IN_MSGDELTA);
+  if (mxGetScalar(IN_MAXATOMS) < -1e-5) {
+    maxatoms = -1;
+  }
+  else {
+    maxatoms = (int)(mxGetScalar(IN_MAXATOMS)+1e-2);
+  }
+  profile = (int)(mxGetScalar(IN_PROFILE)+1e-2);
+  
+  
+  /* check sizes */
+  
+  if (D && x) {
+    n = mxGetM(IN_D);
+    m = mxGetN(IN_D);
+    L = mxGetN(IN_X);
+    
+    if (mxGetM(IN_X) != n) {
+      mexErrMsgTxt("D and X have incompatible sizes.");
+    }
+    
+    if (G) {
+      if (mxGetN(IN_G)!=mxGetM(IN_G)) {
+        mexErrMsgTxt("G must be a square matrix.");
+      }
+      if (mxGetN(IN_G) != m) {
+        mexErrMsgTxt("D and G have incompatible sizes.");
+      }
+    }
+  }
+  
+  else if (DtX && XtX) {
+    m = mxGetM(IN_DtX);
+    L = mxGetN(IN_DtX);
+    
+    /* set n to an arbitrary value that is at least the max possible number of selected atoms */
+    
+    if (maxatoms>0) {
+      n = maxatoms;
+    }
+    else {
+      n = m;
+    }
+    
+    if ( !(mxGetM(IN_XtX)==L && mxGetN(IN_XtX)==1) && !(mxGetM(IN_XtX)==1 && mxGetN(IN_XtX)==L) ) {
+      mexErrMsgTxt("DtX and XtX have incompatible sizes.");
+    }
+    
+    if (mxGetN(IN_G)!=mxGetM(IN_G)) {
+      mexErrMsgTxt("G must be a square matrix.");
+    }
+    if (mxGetN(IN_G) != m) {
+      mexErrMsgTxt("DtX and G have incompatible sizes.");
+    }
+  }
+  
+  else {
+    mexErrMsgTxt("Either D and X, or DtX and XtX, must be specified.");
+  }
+  
+  
+  /* Do OMP! */
+  
+  GAMMA_OUT = ompcore(D, x, DtX, XtX, G, n, m, L, maxatoms, eps, gmode, profile, msgdelta, 1);
+  
+  return;
+}
diff --git a/private/omp2mex.m b/private/omp2mex.m
new file mode 100755
index 0000000000000000000000000000000000000000..f6f44039b84d756e1341c41c74a8fb15297d2ea5
--- /dev/null
+++ b/private/omp2mex.m
@@ -0,0 +1,23 @@
+%This is the Matlab interface to the OMP2 MEX implementation.
+%The function is not for independent use, only through omp2.m.
+
+
+%OMP2MEX Matlab interface to the OMP2 MEX implementation.
+%  GAMMA = OMP2MEX(D,X,DtX,XtX,G,EPSILON,SPARSE_G,MSGDELTA,MAXATOMS,PROFILE)
+%  invokes the OMP2 MEX function according to the specified parameters. Not
+%  all the parameters are required. Those among D, X, DtX, XtX and G which
+%  are not specified should be passed as [].
+%
+%  EPSILON - the target error.
+%  SPARSE_G - returns a sparse GAMMA when nonzero, full GAMMA when zero.
+%  MSGDELTA - the delay in secs between messages. Zero means no messages.
+%  MAXATOMS - the max number of atoms per signal, negative for no max.
+%  PROFILE - nonzero means that profiling information should be printed.
+
+
+%  Ron Rubinstein
+%  Computer Science Department
+%  Technion, Haifa 32000 Israel
+%  ronrubin@cs
+%
+%  April 2009
diff --git a/private/omp2mex.mexa64 b/private/omp2mex.mexa64
new file mode 100755
index 0000000000000000000000000000000000000000..3f22bdf31f0489e6653f3a89ca6e09c3b0cf082b
Binary files /dev/null and b/private/omp2mex.mexa64 differ
diff --git a/private/omp2mex.mexmaci64 b/private/omp2mex.mexmaci64
new file mode 100755
index 0000000000000000000000000000000000000000..e9138de400661c3d7c753a2891f45c45fe477ce6
Binary files /dev/null and b/private/omp2mex.mexmaci64 differ
diff --git a/private/omp2mex.mexw64 b/private/omp2mex.mexw64
new file mode 100755
index 0000000000000000000000000000000000000000..7fddcd587bf09dc698484f9be745622a4d07769b
Binary files /dev/null and b/private/omp2mex.mexw64 differ
diff --git a/private/ompcore.c b/private/ompcore.c
new file mode 100755
index 0000000000000000000000000000000000000000..b4fe24ab2361bd503be58ebab1205c398bd10b50
--- /dev/null
+++ b/private/ompcore.c
@@ -0,0 +1,409 @@
+/**************************************************************************
+ *
+ * File name: ompcore.c
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Last Updated: 25.8.2009
+ *
+ *************************************************************************/
+
+
+#include "ompcore.h"
+#include "omputils.h"
+#include "ompprof.h"
+#include "myblas.h"
+#include <math.h>
+#include <string.h>
+
+
+
+/******************************************************************************
+ *                                                                            *
+ *                           Batch-OMP Implementation                         *
+ *                                                                            *
+ ******************************************************************************/  
+
+mxArray* ompcore(double D[], double x[], double DtX[], double XtX[], double G[], mwSize n, mwSize m, mwSize L,
+                 int T, double eps, int gamma_mode, int profile, double msg_delta, int erroromp)
+{
+  
+  profdata pd;
+  mxArray *Gamma;
+  mwIndex i, j, signum, pos, *ind, *gammaIr, *gammaJc, gamma_count;
+  mwSize allocated_coefs, allocated_cols;
+  int DtX_specified, XtX_specified, batchomp, standardomp, *selected_atoms;
+  double *alpha, *r, *Lchol, *c, *Gsub, *Dsub, sum, *gammaPr, *tempvec1, *tempvec2; 
+  double eps2, resnorm, delta, deltaprev, secs_remain;
+  int mins_remain, hrs_remain;
+  clock_t lastprint_time, starttime;
+ 
+  
+  
+  /*** status flags ***/
+  
+  DtX_specified = (DtX!=0);   /* indicates whether D'*x was provided */
+  XtX_specified = (XtX!=0);   /* indicates whether sum(x.*x) was provided */
+  
+  standardomp = (G==0);       /* batch-omp or standard omp are selected depending on availability of G */
+  batchomp = !standardomp;
+  
+  
+  
+  /*** allocate output matrix ***/
+  
+  
+  if (gamma_mode == FULL_GAMMA) {
+    
+    /* allocate full matrix of size m X L */
+    
+    Gamma = mxCreateDoubleMatrix(m, L, mxREAL);
+    gammaPr = mxGetPr(Gamma);
+    gammaIr = 0;
+    gammaJc = 0;
+  }
+  else {
+    
+    /* allocate sparse matrix with room for allocated_coefs nonzeros */
+    
+    /* for error-omp, begin with L*sqrt(n)/2 allocated nonzeros, otherwise allocate L*T nonzeros */
+    allocated_coefs = erroromp ? (mwSize)(ceil(L*sqrt((double)n)/2.0) + 1.01) : L*T;
+    Gamma = mxCreateSparse(m, L, allocated_coefs, mxREAL);
+    gammaPr = mxGetPr(Gamma);
+    gammaIr = mxGetIr(Gamma);
+    gammaJc = mxGetJc(Gamma);
+    gamma_count = 0;
+    gammaJc[0] = 0;
+  }
+  
+  
+  /*** helper arrays ***/
+  
+  alpha = (double*)mxMalloc(m*sizeof(double));        /* contains D'*residual */
+  ind = (mwIndex*)mxMalloc(n*sizeof(mwIndex));        /* indices of selected atoms */
+  selected_atoms = (int*)mxMalloc(m*sizeof(int));     /* binary array with 1's for selected atoms */
+  c = (double*)mxMalloc(n*sizeof(double));            /* orthogonal projection result */
+  
+  /* current number of columns in Dsub / Gsub / Lchol */
+  allocated_cols = erroromp ? (mwSize)(ceil(sqrt((double)n)/2.0) + 1.01) : T;
+  
+  /* Cholesky decomposition of D_I'*D_I */
+  Lchol = (double*)mxMalloc(n*allocated_cols*sizeof(double));
+
+  /* temporary vectors for various computations */
+  tempvec1 = (double*)mxMalloc(m*sizeof(double));
+  tempvec2 = (double*)mxMalloc(m*sizeof(double));
+  
+  if (batchomp) {
+    /* matrix containing G(:,ind) - the columns of G corresponding to the selected atoms, in order of selection */
+    Gsub = (double*)mxMalloc(m*allocated_cols*sizeof(double));
+  }
+  else {
+    /* matrix containing D(:,ind) - the selected atoms from D, in order of selection */
+    Dsub = (double*)mxMalloc(n*allocated_cols*sizeof(double));
+    
+    /* stores the residual */
+    r = (double*)mxMalloc(n*sizeof(double));        
+  }
+  
+  if (!DtX_specified) {
+    /* contains D'*x for the current signal */
+    DtX = (double*)mxMalloc(m*sizeof(double));  
+  }
+  
+  
+  
+  /*** initializations for error omp ***/
+  
+  if (erroromp) {
+    eps2 = eps*eps;        /* compute eps^2 */
+    if (T<0 || T>n) {      /* unspecified max atom num - set max atoms to n */
+      T = n;
+    }
+  }
+  
+  
+  
+  /*** initialize timers ***/
+  
+  initprofdata(&pd);             /* initialize profiling counters */
+  starttime = clock();           /* record starting time for eta computations */
+  lastprint_time = starttime;    /* time of last status display */
+  
+  
+  
+  /**********************   perform omp for each signal   **********************/
+  
+  
+  
+  for (signum=0; signum<L; ++signum) {
+    
+    
+    /* initialize residual norm and deltaprev for error-omp */
+    
+    if (erroromp) {
+      if (XtX_specified) {
+        resnorm = XtX[signum];
+      }
+      else {
+        resnorm = dotprod(x+n*signum, x+n*signum, n);
+        addproftime(&pd, XtX_TIME);
+      }
+      deltaprev = 0;     /* delta tracks the value of gamma'*G*gamma */
+    }
+    else {
+      /* ignore residual norm stopping criterion */
+      eps2 = 0;
+      resnorm = 1;
+    }
+    
+    
+    if (resnorm>eps2 && T>0) {
+      
+      /* compute DtX */
+      
+      if (!DtX_specified) {
+        matT_vec(1, D, x+n*signum, DtX, n, m);
+        addproftime(&pd, DtX_TIME);
+      }
+      
+      
+      /* initialize alpha := DtX */
+      
+      memcpy(alpha, DtX + m*signum*DtX_specified, m*sizeof(double));
+      
+      
+      /* mark all atoms as unselected */
+      
+      for (i=0; i<m; ++i) {
+        selected_atoms[i] = 0;
+      }
+      
+    }
+    
+
+    /* main loop */
+    
+    i=0;
+    while (resnorm>eps2 && i<T) {
+
+      /* index of next atom */
+      
+      pos = maxabs(alpha, m);
+      addproftime(&pd, MAXABS_TIME);
+      
+      
+      /* stop criterion: selected same atom twice, or inner product too small */
+      
+      if (selected_atoms[pos] || alpha[pos]*alpha[pos]<1e-14) {
+        break;
+      }
+      
+      
+      /* mark selected atom */
+      
+      ind[i] = pos;
+      selected_atoms[pos] = 1;
+      
+      
+      /* matrix reallocation */
+      
+      if (erroromp && i>=allocated_cols) {
+        
+        allocated_cols = (mwSize)(ceil(allocated_cols*MAT_INC_FACTOR) + 1.01);
+        
+        Lchol = (double*)mxRealloc(Lchol,n*allocated_cols*sizeof(double));
+        
+        batchomp ? (Gsub = (double*)mxRealloc(Gsub,m*allocated_cols*sizeof(double))) :
+                   (Dsub = (double*)mxRealloc(Dsub,n*allocated_cols*sizeof(double))) ;
+      }
+      
+      
+      /* append column to Gsub or Dsub */
+      
+      if (batchomp) {
+        memcpy(Gsub+i*m, G+pos*m, m*sizeof(double));
+      }
+      else {
+        memcpy(Dsub+i*n, D+pos*n, n*sizeof(double));
+      }
+      
+      
+      /*** Cholesky update ***/
+      
+      if (i==0) {
+        *Lchol = 1;
+      }
+      else {
+        
+        /* incremental Cholesky decomposition: compute next row of Lchol */
+        
+        if (standardomp) {
+          matT_vec(1, Dsub, D+n*pos, tempvec1, n, i);      /* compute tempvec1 := Dsub'*d where d is new atom */
+          addproftime(&pd, DtD_TIME);
+        }
+        else {
+          vec_assign(tempvec1, Gsub+i*m, ind, i);          /* extract tempvec1 := Gsub(ind,i) */
+        }
+        backsubst('L', Lchol, tempvec1, tempvec2, n, i);   /* compute tempvec2 = Lchol \ tempvec1 */
+        for (j=0; j<i; ++j) {                              /* write tempvec2 to end of Lchol */
+          Lchol[j*n+i] = tempvec2[j];
+        }
+        
+        /* compute Lchol(i,i) */
+        sum = 0;
+        for (j=0; j<i; ++j) {         /* compute sum of squares of last row without Lchol(i,i) */
+          sum += SQR(Lchol[j*n+i]);
+        }
+        if ( (1-sum) <= 1e-14 ) {     /* Lchol(i,i) is zero => selected atoms are dependent */
+          break;
+        }
+        Lchol[i*n+i] = sqrt(1-sum);
+      }
+      
+      addproftime(&pd, LCHOL_TIME);
+
+      i++;
+      
+      
+      /* perform orthogonal projection and compute sparse coefficients */
+      
+      vec_assign(tempvec1, DtX + m*signum*DtX_specified, ind, i);   /* extract tempvec1 = DtX(ind) */
+      cholsolve('L', Lchol, tempvec1, c, n, i);                     /* solve LL'c = tempvec1 for c */
+      addproftime(&pd, COMPCOEF_TIME);
+      
+
+      /* update alpha = D'*residual */
+      
+      if (standardomp) {
+        mat_vec(-1, Dsub, c, r, n, i);             /* compute r := -Dsub*c */
+        vec_sum(1, x+n*signum, r, n);              /* compute r := x+r */
+        
+        
+        /*memcpy(r, x+n*signum, n*sizeof(double));   /* assign r := x */
+        /*mat_vec1(-1, Dsub, c, 1, r, n, i);         /* compute r := r-Dsub*c */
+        
+        addproftime(&pd, COMPRES_TIME);
+        matT_vec(1, D, r, alpha, n, m);            /* compute alpha := D'*r */
+        addproftime(&pd, DtR_TIME);
+        
+        /* update residual norm */
+        if (erroromp) {
+          resnorm = dotprod(r, r, n);
+          addproftime(&pd, UPDATE_RESNORM_TIME);
+        }
+      }
+      else {
+        mat_vec(1, Gsub, c, tempvec1, m, i);                              /* compute tempvec1 := Gsub*c */
+        memcpy(alpha, DtX + m*signum*DtX_specified, m*sizeof(double));    /* set alpha = D'*x */
+        vec_sum(-1, tempvec1, alpha, m);                                  /* compute alpha := alpha - tempvec1 */
+        addproftime(&pd, UPDATE_DtR_TIME);
+        
+        /* update residual norm */
+        if (erroromp) {
+          vec_assign(tempvec2, tempvec1, ind, i);      /* assign tempvec2 := tempvec1(ind) */
+          delta = dotprod(c,tempvec2,i);               /* compute c'*tempvec2 */
+          resnorm = resnorm - delta + deltaprev;       /* residual norm update */
+          deltaprev = delta;
+          addproftime(&pd, UPDATE_RESNORM_TIME);
+        }
+      }
+    }
+    
+    
+    /*** generate output vector gamma ***/
+
+    if (gamma_mode == FULL_GAMMA) {    /* write the coefs in c to their correct positions in gamma */
+      for (j=0; j<i; ++j) {
+        gammaPr[m*signum + ind[j]] = c[j];
+      }
+    }
+    else {
+      /* sort the coefs by index before writing them to gamma */
+      quicksort(ind,c,i);
+      addproftime(&pd, INDEXSORT_TIME);
+      
+      /* gamma is full - reallocate */
+      if (gamma_count+i >= allocated_coefs) {
+        
+        while(gamma_count+i >= allocated_coefs) {
+          allocated_coefs = (mwSize)(ceil(GAMMA_INC_FACTOR*allocated_coefs) + 1.01);
+        }
+        
+        mxSetNzmax(Gamma, allocated_coefs);
+        mxSetPr(Gamma, mxRealloc(gammaPr, allocated_coefs*sizeof(double)));
+        mxSetIr(Gamma, mxRealloc(gammaIr, allocated_coefs*sizeof(mwIndex)));
+        
+        gammaPr = mxGetPr(Gamma);
+        gammaIr = mxGetIr(Gamma);
+      }
+      
+      /* append coefs to gamma and update the indices */
+      for (j=0; j<i; ++j) {
+        gammaPr[gamma_count] = c[j];
+        gammaIr[gamma_count] = ind[j];
+        gamma_count++;
+      }
+      gammaJc[signum+1] = gammaJc[signum] + i;
+    }
+    
+    
+    
+    /*** display status messages ***/
+    
+    if (msg_delta>0 && (clock()-lastprint_time)/(double)CLOCKS_PER_SEC >= msg_delta)
+    {
+      lastprint_time = clock();
+      
+      /* estimated remainig time */
+      secs2hms( ((L-signum-1)/(double)(signum+1)) * ((lastprint_time-starttime)/(double)CLOCKS_PER_SEC) ,
+        &hrs_remain, &mins_remain, &secs_remain);
+      
+      mexPrintf("omp: signal %d / %d, estimated remaining time: %02d:%02d:%05.2f\n",        
+        signum+1, L, hrs_remain, mins_remain, secs_remain);
+      mexEvalString("drawnow;");
+    }
+    
+  }
+  
+  /* end omp */
+  
+  
+  
+  /*** print final messages ***/
+  
+  if (msg_delta>0) {
+    mexPrintf("omp: signal %d / %d\n", signum, L);
+  }
+  
+  if (profile) {
+    printprofinfo(&pd, erroromp, batchomp, L);
+  }
+  
+  
+  
+  /* free memory */
+  
+  if (!DtX_specified) {
+    mxFree(DtX);
+  }
+  if (standardomp) {
+    mxFree(r);
+    mxFree(Dsub);
+  }
+  else {
+    mxFree(Gsub);
+  }  
+  mxFree(tempvec2);
+  mxFree(tempvec1);
+  mxFree(Lchol);
+  mxFree(c);
+  mxFree(selected_atoms);
+  mxFree(ind);
+  mxFree(alpha);
+  
+  return Gamma;
+}
diff --git a/private/ompcore.h b/private/ompcore.h
new file mode 100755
index 0000000000000000000000000000000000000000..6fb03a81880ff69bf1d35aa98df4611471331eb3
--- /dev/null
+++ b/private/ompcore.h
@@ -0,0 +1,80 @@
+/**************************************************************************
+ *
+ * File name: ompcore.h
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Last Updated: 18.8.2009
+ *
+ * Contains the core implementation of Batch-OMP / OMP-Cholesky.
+ *
+ *************************************************************************/
+
+
+#ifndef __OMP_CORE_H__
+#define __OMP_CORE_H__
+
+
+#include "mex.h"
+
+
+
+/**************************************************************************
+ * Perform Batch-OMP or OMP-Cholesky on a specified set of signals, using
+ * either a fixed number of atoms or an error bound.
+ *
+ * Parameters (not all required):
+ *
+ *   D - the dictionary, of size n X m
+ *   x - the signals, of size n X L
+ *   DtX - D'*x, of size m X L
+ *   XtX - squared norms of the signals in x, sum(x.*x), of length L
+ *   G - D'*D, of size m X m
+ *   T - target sparsity, or maximal number of atoms for error-based OMP
+ *   eps - target residual norm for error-based OMP
+ *   gamma_mode - one of the constants FULL_GAMMA or SPARSE_GAMMA
+ *   profile - if non-zero, profiling info is printed
+ *   msg_delta - positive: the # of seconds between status prints, otherwise: nothing is printed
+ *   erroromp - if nonzero indicates error-based OMP, otherwise fixed sparsity OMP
+ *
+ * Usage:
+ *
+ *   The function can be called using different parameters, and will have
+ *   different complexity depending on the parameters specified. Arrays which
+ *   are not specified should be passed as null (0). When G is specified, 
+ *   Batch-OMP is performed. Otherwise, OMP-Cholesky is performed.
+ *
+ *   Fixed-sparsity usage:
+ *   ---------------------
+ *   Either DtX, or D and x, must be specified. Specifying DtX is more efficient.
+ *   XtX does not need to be specified.
+ *   When D and x are specified, G is not required. However, not providing G
+ *   will significantly degrade efficiency.
+ *   The number of atoms must be specified in T. The value of eps is ignored.
+ *   Finally, set erroromp to 0.
+ *
+ *   Error-OMP usage:
+ *   ----------------
+ *   Either DtX and Xtx, or D and x, must be specified. Specifying DtX and XtX
+ *   is more efficient.
+ *   When D and x are specified, G is not required. However, not providing G
+ *   will significantly degrade efficiency.
+ *   The target error must be specified in eps. A hard limit on the number
+ *   of atoms can also be specified via the parameter T. Otherwise, T should 
+ *   be negative. Finally, set erroromp to nonzero.
+ *
+ *
+ * Returns: 
+ *   An mxArray containing the sparse representations of the signals in x
+ *   (allocated using the appropriate mxCreateXXX() function).
+ *   The array is either full or sparse, depending on gamma_mode.
+ *
+ **************************************************************************/
+mxArray* ompcore(double D[], double x[], double DtX[], double XtX[], double G[], mwSize n, mwSize m, mwSize L,
+                 int T, double eps, int gamma_mode, int profile, double msg_delta, int erroromp);
+
+
+#endif
diff --git a/private/ompmex.c b/private/ompmex.c
new file mode 100755
index 0000000000000000000000000000000000000000..d8d3e9fa0248f4afc88e17811c41ea97a8b12929
--- /dev/null
+++ b/private/ompmex.c
@@ -0,0 +1,133 @@
+/**************************************************************************
+ *
+ * File name: ompmex.c
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Last Updated: 18.8.2009
+ *
+ *************************************************************************/
+
+#include "ompcore.h"
+#include "omputils.h"
+#include "mexutils.h"
+
+
+/* Input Arguments */
+
+#define IN_D          prhs[0]
+#define IN_X          prhs[1]
+#define IN_DtX        prhs[2]
+#define IN_G          prhs[3]
+#define IN_T          prhs[4]
+#define IN_SPARSE_G   prhs[5]
+#define IN_MSGDELTA   prhs[6]
+#define IN_PROFILE    prhs[7]
+
+
+/* Output Arguments */
+
+#define	GAMMA_OUT     plhs[0]
+
+
+/***************************************************************************************/
+
+
+void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray*prhs[])
+
+{
+  double *D, *x, *DtX, *G, msgdelta;
+  int gmode, profile, T;
+  mwSize m, n, L;   /* D is n x m , X is n x L, DtX is m x L */
+  
+  
+  /* check parameters */
+  
+  checkmatrix(IN_D, "OMP", "D");
+  checkmatrix(IN_X, "OMP", "X");
+  checkmatrix(IN_DtX, "OMP", "DtX");
+  checkmatrix(IN_G, "OMP", "G");
+  
+  checkscalar(IN_T, "OMP", "T");
+  checkscalar(IN_SPARSE_G, "OMP", "sparse_g");
+  checkscalar(IN_MSGDELTA, "OMP", "msgdelta");
+  checkscalar(IN_PROFILE, "OMP", "profile");
+
+  
+  /* get parameters */
+  
+  x = D = DtX = G = 0;
+  
+  if (!mxIsEmpty(IN_D))
+    D = mxGetPr(IN_D);
+  
+  if (!mxIsEmpty(IN_X))
+    x = mxGetPr(IN_X);
+  
+  if (!mxIsEmpty(IN_DtX))
+    DtX = mxGetPr(IN_DtX);
+  
+  if (!mxIsEmpty(IN_G))
+    G = mxGetPr(IN_G);
+  
+  T = (int)(mxGetScalar(IN_T)+1e-2);
+  if ((int)(mxGetScalar(IN_SPARSE_G)+1e-2)) {
+    gmode = SPARSE_GAMMA;
+  }
+  else {
+    gmode = FULL_GAMMA;
+  }
+  msgdelta = mxGetScalar(IN_MSGDELTA);
+  profile = (int)(mxGetScalar(IN_PROFILE)+1e-2);
+  
+  
+  /* check sizes */
+  
+  if (D && x) {
+    n = mxGetM(IN_D);
+    m = mxGetN(IN_D);
+    L = mxGetN(IN_X);
+    
+    if (mxGetM(IN_X) != n) {
+      mexErrMsgTxt("D and X have incompatible sizes.");
+    }
+    
+    if (G) {
+      if (mxGetN(IN_G)!=mxGetM(IN_G)) {
+        mexErrMsgTxt("G must be a square matrix.");
+      }
+      if (mxGetN(IN_G) != m) {
+        mexErrMsgTxt("D and G have incompatible sizes.");
+      }
+    }
+  }
+  
+  else if (DtX) {
+    m = mxGetM(IN_DtX);
+    L = mxGetN(IN_DtX);
+    
+    n = T;  /* arbitrary - it is enough to assume signal length is T */
+    
+    if (mxGetN(IN_G)!=mxGetM(IN_G)) {
+      mexErrMsgTxt("G must be a square matrix.");
+    }
+    if (mxGetN(IN_G) != m) {
+      mexErrMsgTxt("DtX and G have incompatible sizes.");
+    }
+  }
+  
+  else {
+    mexErrMsgTxt("Either D and X, or DtX, must be specified.");
+  }
+  
+  
+  /* Do OMP! */
+  
+  GAMMA_OUT = ompcore(D, x, DtX, 0, G, n, m, L, T, 0, gmode, profile, msgdelta, 0);
+  
+  return;
+}
+
diff --git a/private/ompmex.m b/private/ompmex.m
new file mode 100755
index 0000000000000000000000000000000000000000..6026dd0881db5d1a547425b860e3289c64e179c5
--- /dev/null
+++ b/private/ompmex.m
@@ -0,0 +1,22 @@
+%This is the Matlab interface to the OMP MEX implementation.
+%The function is not for independent use, only through omp.m.
+
+
+%OMPMEX Matlab interface to the OMP MEX implementation.
+%  GAMMA = OMPMEX(D,X,DtX,G,L,SPARSE_G,MSGDELTA,PROFILE) invokes the OMP
+%  MEX function according to the specified parameters. Not all the
+%  parameters are required. Those among D, X, DtX and G which are not
+%  specified should be passed as [].
+%
+%  L - the target sparsity.
+%  SPARSE_G - returns a sparse GAMMA when nonzero, full GAMMA when zero.
+%  MSGDELTA - the delay in secs between messages. Zero means no messages.
+%  PROFILE - nonzero means that profiling information should be printed.
+
+
+%  Ron Rubinstein
+%  Computer Science Department
+%  Technion, Haifa 32000 Israel
+%  ronrubin@cs
+%
+%  April 2009
diff --git a/private/ompmex.mexa64 b/private/ompmex.mexa64
new file mode 100755
index 0000000000000000000000000000000000000000..339fad972d15682fe0edc34de0f06874bc5150cf
Binary files /dev/null and b/private/ompmex.mexa64 differ
diff --git a/private/ompmex.mexmaci64 b/private/ompmex.mexmaci64
new file mode 100755
index 0000000000000000000000000000000000000000..814378bd61ee09123ac26873a040bcb642ea8a8a
Binary files /dev/null and b/private/ompmex.mexmaci64 differ
diff --git a/private/ompmex.mexw64 b/private/ompmex.mexw64
new file mode 100755
index 0000000000000000000000000000000000000000..3547dfaeffc53253a09ca1ef8a1d2f5e38a088b5
Binary files /dev/null and b/private/ompmex.mexw64 differ
diff --git a/private/ompprof.c b/private/ompprof.c
new file mode 100755
index 0000000000000000000000000000000000000000..a83fabd4292b96066d1bcee82f3e0668d6dcaa73
--- /dev/null
+++ b/private/ompprof.c
@@ -0,0 +1,113 @@
+/**************************************************************************
+ *
+ * File name: ompprof.c
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Last Updated: 11.4.2009
+ *
+ *************************************************************************/
+
+
+#include "ompprof.h"
+
+
+/* initialize profiling information */
+
+void initprofdata(profdata *pd)
+{
+  pd->DtX_time = 0;
+  pd->XtX_time = 0;
+  pd->DtR_time = 0;
+  pd->maxabs_time = 0;
+  pd->DtD_time = 0;
+  pd->Lchol_time = 0;
+  pd->compcoef_time = 0;
+  pd->update_DtR_time = 0;
+  pd->update_resnorm_time = 0;
+  pd->compres_time = 0;
+  pd->indexsort_time = 0;
+  
+  pd->DtX_time_counted = 0;
+  pd->XtX_time_counted = 0;
+  pd->DtR_time_counted = 0;
+  pd->DtD_time_counted = 0;
+  pd->update_DtR_time_counted = 0;
+  pd->resnorm_time_counted = 0;
+  pd->compres_time_counted = 0;
+  pd->indexsort_time_counted = 0;
+  
+  pd->prevtime = clock();
+}
+
+
+/* add elapsed time to profiling data according to specified computation */
+
+void addproftime(profdata *pd, int comptype)
+{
+  switch(comptype) {
+    case DtX_TIME:            pd->DtX_time            += clock()-pd->prevtime; pd->DtX_time_counted = 1; break;
+    case XtX_TIME:            pd->XtX_time            += clock()-pd->prevtime; pd->XtX_time_counted = 1; break;
+    case DtR_TIME:            pd->DtR_time            += clock()-pd->prevtime; pd->DtR_time_counted = 1; break;
+    case DtD_TIME:            pd->DtD_time            += clock()-pd->prevtime; pd->DtD_time_counted = 1; break;
+    case COMPRES_TIME:        pd->compres_time        += clock()-pd->prevtime; pd->compres_time_counted = 1; break;
+    case UPDATE_DtR_TIME:     pd->update_DtR_time     += clock()-pd->prevtime; pd->update_DtR_time_counted = 1; break;
+    case UPDATE_RESNORM_TIME: pd->update_resnorm_time += clock()-pd->prevtime; pd->resnorm_time_counted = 1; break;
+    case INDEXSORT_TIME:      pd->indexsort_time      += clock()-pd->prevtime; pd->indexsort_time_counted = 1; break;
+    case MAXABS_TIME:         pd->maxabs_time         += clock()-pd->prevtime; break;
+    case LCHOL_TIME:          pd->Lchol_time          += clock()-pd->prevtime; break;
+    case COMPCOEF_TIME:       pd->compcoef_time       += clock()-pd->prevtime; break;
+  }
+  pd->prevtime = clock();
+}
+
+
+/* print profiling info */
+
+void printprofinfo(profdata *pd, int erroromp, int batchomp, int signum)
+{
+  clock_t tottime;
+  
+  tottime = pd->DtX_time + pd->XtX_time + pd->DtR_time + pd->DtD_time + pd->compres_time + pd->maxabs_time + 
+            pd->Lchol_time + pd->compcoef_time + pd->update_DtR_time + pd->update_resnorm_time + pd->indexsort_time;
+  
+  mexPrintf("\n\n*****  Profiling information for %s  *****\n\n", erroromp? "OMP2" : "OMP");
+  
+  mexPrintf("OMP mode: %s\n\n", batchomp? "Batch-OMP" : "OMP-Cholesky");
+  
+  mexPrintf("Total signals processed: %d\n\n", signum);
+  
+  if (pd->DtX_time_counted) {
+    mexPrintf("Compute DtX time:      %7.3lf seconds\n", pd->DtX_time/(double)CLOCKS_PER_SEC);
+  }
+  if (pd->XtX_time_counted) {
+    mexPrintf("Compute XtX time:      %7.3lf seconds\n", pd->XtX_time/(double)CLOCKS_PER_SEC);
+  }
+  mexPrintf("Max abs time:          %7.3lf seconds\n", pd->maxabs_time/(double)CLOCKS_PER_SEC);
+  if (pd->DtD_time_counted) {
+    mexPrintf("Compute DtD time:      %7.3lf seconds\n", pd->DtD_time/(double)CLOCKS_PER_SEC);
+  }
+  mexPrintf("Lchol update time:     %7.3lf seconds\n", pd->Lchol_time/(double)CLOCKS_PER_SEC);
+  mexPrintf("Compute coef time:     %7.3lf seconds\n", pd->compcoef_time/(double)CLOCKS_PER_SEC);
+  if (pd->compres_time_counted) {
+    mexPrintf("Compute R time:        %7.3lf seconds\n", pd->compres_time/(double)CLOCKS_PER_SEC);
+  }
+  if (pd->DtR_time_counted) {
+    mexPrintf("Compute DtR time:      %7.3lf seconds\n", pd->DtR_time/(double)CLOCKS_PER_SEC);
+  }
+  if (pd->update_DtR_time_counted) {
+    mexPrintf("Update DtR time:       %7.3lf seconds\n", pd->update_DtR_time/(double)CLOCKS_PER_SEC);
+  }
+  if (pd->resnorm_time_counted) {
+    mexPrintf("Update resnorm time:   %7.3lf seconds\n", pd->update_resnorm_time/(double)CLOCKS_PER_SEC);
+  }
+  if (pd->indexsort_time_counted) {
+    mexPrintf("Index sort time:       %7.3lf seconds\n", pd->indexsort_time/(double)CLOCKS_PER_SEC);
+  }
+  mexPrintf("---------------------------------------\n");
+  mexPrintf("Total time:            %7.3lf seconds\n\n", tottime/(double)CLOCKS_PER_SEC);
+}
+
diff --git a/private/ompprof.h b/private/ompprof.h
new file mode 100755
index 0000000000000000000000000000000000000000..e71dbe1d625248a0f09e0d40e66ceb172ce12f3a
--- /dev/null
+++ b/private/ompprof.h
@@ -0,0 +1,106 @@
+/**************************************************************************
+ *
+ * File name: ompprof.h
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Last Updated: 18.8.2009
+ *
+ * Collection of definitions and functions for profiling the OMP method.
+ *
+ *************************************************************************/
+
+
+#ifndef __OMP_PROF_H__
+#define __OMP_PROF_H__
+
+#include "mex.h"
+#include <time.h>
+
+
+
+/**************************************************************************
+ *
+ * Constants and data types.
+ *
+ **************************************************************************/
+
+
+/* constants denoting the various parts of the algorithm */
+
+enum { DtX_TIME, XtX_TIME, DtR_TIME, MAXABS_TIME, DtD_TIME, LCHOL_TIME, COMPCOEF_TIME, 
+       UPDATE_DtR_TIME, UPDATE_RESNORM_TIME, COMPRES_TIME, INDEXSORT_TIME };
+
+       
+       
+/* profiling data container with counters for each part of the algorithm */
+       
+typedef struct profdata 
+{
+  clock_t prevtime;  /* the time when last initialization/call to addproftime() was performed */
+  
+  clock_t DtX_time;
+  clock_t XtX_time;
+  clock_t DtR_time;
+  clock_t maxabs_time;
+  clock_t DtD_time;
+  clock_t Lchol_time;
+  clock_t compcoef_time;
+  clock_t update_DtR_time;
+  clock_t update_resnorm_time;
+  clock_t compres_time;
+  clock_t indexsort_time;
+  
+  /* flags indicating whether profiling data was gathered */
+  int DtX_time_counted;
+  int XtX_time_counted;
+  int DtR_time_counted;
+  int DtD_time_counted;
+  int update_DtR_time_counted;
+  int resnorm_time_counted;
+  int compres_time_counted;
+  int indexsort_time_counted;
+  
+} profdata;
+
+
+
+/**************************************************************************
+ *
+ * Initialize a profdata structure, zero all counters, and start its timer.
+ *
+ **************************************************************************/
+void initprofdata(profdata *pd);
+
+
+/**************************************************************************
+ *
+ * Add elapsed time from last call to addproftime(), or from initialization
+ * of profdata, to the counter specified by comptype. comptype must be one
+ * of the constants in the enumeration above.
+ *
+ **************************************************************************/
+void addproftime(profdata *pd, int comptype);
+
+
+/**************************************************************************
+ *
+ * Print the current contents of the counters in profdata.
+ *
+ * Parameters:
+ *   pd - the profdata to print
+ *   erroromp - indicates whether error-based (nonzero) or sparsity-based (zero)
+ *              omp was performed.
+ *   batchomp - indicates whether batch-omp (nonzero) or omp-cholesky (zero)
+ *              omp was performed.
+ *   signum   - number of signals processed by omp
+ *
+ **************************************************************************/
+void printprofinfo(profdata *pd, int erroromp, int batchomp, int signum);
+
+
+#endif
+
diff --git a/private/omputils.c b/private/omputils.c
new file mode 100755
index 0000000000000000000000000000000000000000..d70ffe0ef2813e0dde5e6a2da428b4be3b3e66bd
--- /dev/null
+++ b/private/omputils.c
@@ -0,0 +1,89 @@
+/**************************************************************************
+ *
+ * File name: omputils.c
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Last Updated: 18.8.2009
+ *
+ *************************************************************************/
+
+#include "omputils.h"
+#include <math.h>
+
+
+const char FULL_GAMMA_STR[] = "full";
+const char SPARSE_GAMMA_STR[] = "sparse";
+
+
+/* convert seconds to hours, minutes and seconds */
+
+void secs2hms(double sectot, int *hrs, int *mins, double *secs)
+{
+  *hrs = (int)(floor(sectot/3600)+1e-2);
+  sectot = sectot - 3600*(*hrs);
+  *mins = (int)(floor(sectot/60)+1e-2);
+  *secs = sectot - 60*(*mins);
+}
+
+
+/* quicksort, public-domain C implementation by Darel Rex Finley. */
+/* modification: sorts the array data[] as well, according to the values in the array vals[] */
+
+#define  MAX_LEVELS  300
+
+void quicksort(mwIndex vals[], double data[], mwIndex n) {
+  
+  long piv, beg[MAX_LEVELS], end[MAX_LEVELS], i=0, L, R, swap ;
+  double datapiv;
+  
+  beg[0]=0;
+  end[0]=n;
+  
+  while (i>=0) {
+    
+    L=beg[i]; 
+    R=end[i]-1;
+    
+    if (L<R) {
+      
+      piv=vals[L];
+      datapiv=data[L];
+      
+      while (L<R) 
+      {
+        while (vals[R]>=piv && L<R) 
+          R--;
+        if (L<R) {
+          vals[L]=vals[R];
+          data[L++]=data[R];
+        }
+        
+        while (vals[L]<=piv && L<R) 
+          L++;
+        if (L<R) { 
+          vals[R]=vals[L];
+          data[R--]=data[L];
+        }
+      }
+      
+      vals[L]=piv;
+      data[L]=datapiv;
+      
+      beg[i+1]=L+1;
+      end[i+1]=end[i];
+      end[i++]=L;
+      
+      if (end[i]-beg[i] > end[i-1]-beg[i-1]) {
+        swap=beg[i]; beg[i]=beg[i-1]; beg[i-1]=swap;
+        swap=end[i]; end[i]=end[i-1]; end[i-1]=swap;
+      }
+    }
+    else {
+      i--;
+    }
+  }
+}
diff --git a/private/omputils.h b/private/omputils.h
new file mode 100755
index 0000000000000000000000000000000000000000..32839cdaef38dde7bf9f87721163b7d403072144
--- /dev/null
+++ b/private/omputils.h
@@ -0,0 +1,77 @@
+/**************************************************************************
+ *
+ * File name: omputils.h
+ *
+ * Ron Rubinstein
+ * Computer Science Department
+ * Technion, Haifa 32000 Israel
+ * ronrubin@cs
+ *
+ * Last Updated: 18.8.2009
+ *
+ * Utility definitions and functions for the OMP library.
+ *
+ *************************************************************************/
+
+
+#ifndef __OMP_UTILS_H__
+#define __OMP_UTILS_H__
+
+#include "mex.h"
+
+
+/* constants for the representation mode of gamma */
+
+extern const char FULL_GAMMA_STR[];      /* "full" */
+extern const char SPARSE_GAMMA_STR[];    /* "sparse" */
+
+
+#define FULL_GAMMA 1
+#define SPARSE_GAMMA 2
+#define INVALID_MODE 3
+
+
+
+/**************************************************************************
+ * Memory management for OMP2.
+ *
+ * GAMMA_INC_FACTOR:
+ * The matrix GAMMA is allocated with sqrt(n)/2 coefficients per signal,
+ * for a total of nzmax = L*sqrt(n)/2 nonzeros. Whenever GAMMA needs to be
+ * increased, it is increased by a factor of GAMMA_INC_FACTOR.
+ *
+ * MAT_INC_FACTOR:
+ * The matrices Lchol, Gsub and Dsub are allocated with sqrt(n)/2
+ * columns each. If additional columns are needed, this number is 
+ * increased by a factor of MAT_INC_FACTOR.
+ **************************************************************************/
+
+#define GAMMA_INC_FACTOR (1.4)
+#define MAT_INC_FACTOR (1.6)
+
+
+
+/**************************************************************************
+ * Convert number of seconds to hour, minute and second representation.
+ *
+ * Parameters:
+ *   sectot - total number of seconds
+ *   hrs, mins, secs - output hours (whole) and minutes (whole) and seconds
+ *
+ **************************************************************************/
+void secs2hms(double sectot, int *hrs, int *mins, double *secs);
+
+
+
+/**************************************************************************
+ * QuickSort - public-domain C implementation by Darel Rex Finley.
+ *
+ * Modified to sort both the array vals[] and the array data[] according 
+ * to the values in the array vals[].
+ *
+ **************************************************************************/
+void quicksort(mwIndex vals[], double data[], mwIndex n);
+
+
+#endif
+
diff --git a/private/printf.m b/private/printf.m
new file mode 100755
index 0000000000000000000000000000000000000000..402ac26f2595f2ab7db4bcb98ec86cde4adb1073
--- /dev/null
+++ b/private/printf.m
@@ -0,0 +1,26 @@
+function str = printf(varargin)
+%PRINTF Print formatted text to screen.
+%  PRINTF(FMT,VAL1,VAL2,...) formats the data in VAL1,VAL2,... according to
+%  the format string FMT, and prints the result to the screen.
+%
+%  The call to PRINTF(FMT,VAL1,VAL2,...) simply invokes the call
+%  DISP(SPRINTF(FMT,VAL1,VAL2,...)). For a complete description of the
+%  format string options see function SPRINTF.
+%
+%  STR = PRINTF(...) also returns the formatted string.
+
+
+%  Ron Rubinstein
+%  Computer Science Department
+%  Technion, Haifa 32000 Israel
+%  ronrubin@cs
+%
+%  April 2008
+
+
+if (nargout>0)
+  str = sprintf(varargin{:});
+  disp(str);
+else
+  disp(sprintf(varargin{:}));
+end