commit ce067dbbf2
initial push
@ -0,0 +1,5 @@
<frame >

<titlepage>

</frame>
@ -0,0 +1,14 @@
<frame title="Problem">
<list>
<e>Paper with Benedikt</e>
<e>requires multiple very specific datasets</e>
<l2st>
<e>many but not too many features</e>
<e>at least some samples (for the NN)</e>
<e>only numerical attributes work best</e>
<e>specific quality</e>
<e>unrelated datasets</e>
</l2st>
<e>Requires you to search for many datasets and filter them</e>
</list>
</frame>
@ -0,0 +1,9 @@
<frame title="Students">
<list>
<e>Not clear what you can use</e>
<e>Many different formats</e>
<e>train/test splits</e>
<e>So for Students I just do this work and send them archives directly</e>
<e>->Not a good solution</e>
</list>
</frame>
@ -0,0 +1,12 @@
<frame title="yano">
<list>
<e>So I have been packaging all my scripts</e>
<e>I had a surprising amount of fun doing this</e>
<l2st>
<e>More than just standard functions</e>
<e>A couple of weird decisions</e>
<e>And this will likely grow further</e>
</l2st>
<e>->So I would like to discuss some parts with you, and maybe you even have more features you might want</e>
</list>
</frame>
@ -0,0 +1,17 @@
<frame title="yano">
<split>
<que>
<list>
<e>Simply install it via pip</e>
<e>Contains 187 real-world datasets</e>
<e>->biggest library of datasets explicitly for anomaly detection</e>
<e>not yet happy with this</e>
<e>especially: it mostly contains only numerical and nominal attributes</e>
<e>->few categorical and no time-series attributes</e>
</list>
</que>
<que>
<i f="../prep/04yano/a.png" wmode="True"></i>
</que>
</split>
</frame>
@ -0,0 +1,17 @@
<section Basics>
<frame title="selector">

<code>
import yano
from yano.symbols import *

condition = ((number_of_features > 5) &
             (number_of_features < 100) &
             (number_of_samples > 100) &
             (number_of_samples < 10000) &
             (number_of_samples > 2 * number_of_features) &
             ~index)
print(len(condition), "Datasets found")
</code>
->33 Datasets found

</frame>
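Selector expressions like `number_of_features > 5` can be built with Python operator overloading. The following is a minimal sketch of that idea, not yano's actual implementation: the `Symbol`/`Condition` classes and the toy dataset registry are hypothetical stand-ins, and comparisons between two symbols (as in `number_of_samples > 2*number_of_features`) would need additional overloads.

```python
# Sketch: selector symbols via operator overloading. NOT yano's real code;
# the registry below is a toy stand-in for illustration only.

class Condition:
    def __init__(self, pred):
        self.pred = pred  # function: dataset-dict -> bool

    def __and__(self, other):
        return Condition(lambda d: self.pred(d) and other.pred(d))

    def __invert__(self):
        return Condition(lambda d: not self.pred(d))

    def select(self, datasets):
        return [d for d in datasets if self.pred(d)]

class Symbol:
    def __init__(self, key):
        self.key = key

    def __gt__(self, value):
        return Condition(lambda d, k=self.key, v=value: d[k] > v)

    def __lt__(self, value):
        return Condition(lambda d, k=self.key, v=value: d[k] < v)

number_of_features = Symbol("n_features")
number_of_samples = Symbol("n_samples")

toy = [{"name": "a", "n_features": 10, "n_samples": 500},
       {"name": "b", "n_features": 200, "n_samples": 500},
       {"name": "c", "n_features": 8, "n_samples": 50}]

cond = (number_of_features > 5) & (number_of_features < 100) & (number_of_samples > 100)
print([d["name"] for d in cond.select(toy)])  # -> ['a']
```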
@ -0,0 +1,29 @@
<frame title="selectors">
<list>
<e>Lots of symbols like this</e>
<l2st>
<e>name</e>
<e>number\_of\_features</e>
<e>number\_of\_samples</e>
<e>index (correlated datasets)</e>
</l2st>
<e>Feature types</e>
<l2st>
<e>numeric</e>
<e>nominal</e>
<e>categorical</e>
<e>(textual)</e>
</l2st>
<e>Count-based</e>
<l2st>
<e>number\_anomalies</e>
<e>number\_normals</e>
<e>fraction\_anomalies</e>
</l2st>
<e>Specific ones</e>
<l2st>
<e>image\_based</e>
<e>(linearly\_separable)</e>
</l2st>
</list>
</frame>
@ -0,0 +1,15 @@
<frame title="iterating">

<code>
for dataset in condition:
    print(dataset)
</code>
<l2st>
<e>\[annthyroid\]</e>
<e>\[breastw\]</e>
<e>\[cardio\]</e>
<e>\[...\]</e>
<e>\[Housing\_low\]</e>
</l2st>

</frame>
@ -0,0 +1,9 @@
<frame title="iterating">

<code>
for dataset in condition:
    x = dataset.getx()
    y = dataset.gety()
</code>

</frame>
@ -0,0 +1,12 @@
<frame title="pipeline">

<code>
from yano.iter import *

for dataset, x, tx, ty in pipeline(condition,
                                   split,
                                   shuffle,
                                   normalize("minmax")):
    ...
</code>

</frame>
@ -0,0 +1,16 @@
<frame title="pipeline">
<list>
<e>Again, there are a couple of possible modifiers</e>
<l2st>
<e>nonconst->remove constant features</e>
<e>shuffle</e>
<e>normalize('zscore'/'minmax')</e>
<e>cut(10)->at most 10 datasets</e>
<e>split->train/test split, all anomalies in the test set</e>
<e>crossval(5)->similar to split, but done multiple times (cross-validation)</e>
</l2st>
<e>Modifiers interact with each other</e>
<e>For example: normalize('minmax'), split</e>
<e>->train set always below 1, but no guarantees for the test set</e>
</list>
</frame>
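One way such a modifier chain can work is by composing generator transforms over a dataset stream. A minimal sketch, assuming toy `(name, x, y)` tuples; the function names mirror the slide but this is not yano's real implementation:

```python
# Sketch: a pipeline() that chains dataset modifiers, where each modifier is a
# generator transform. Illustration only, not yano's actual API.
import random

def shuffle(stream):
    # shuffle the samples within each dataset
    for name, x, y in stream:
        idx = list(range(len(x)))
        random.shuffle(idx)
        yield name, [x[i] for i in idx], [y[i] for i in idx]

def cut(n):
    # keep at most n datasets
    def mod(stream):
        for k, item in enumerate(stream):
            if k >= n:
                break
            yield item
    return mod

def pipeline(datasets, *modifiers):
    stream = iter(datasets)
    for mod in modifiers:
        stream = mod(stream)
    return stream

toy = [("a", [1, 2, 3], [0, 0, 1]), ("b", [4, 5], [0, 1]), ("c", [6], [1])]
names = [name for name, x, y in pipeline(toy, shuffle, cut(2))]
print(names)  # -> ['a', 'b']
```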
@ -0,0 +1,18 @@
<frame title="CrossValidation">
<list>
<e>Learned from DMC: cross-validation is important</e>
<e>Rarely found in anomaly detection. Why?</e>
<e>A bit more complicated (not all samples are equal), but no reason not to</e>
<e>->So I implemented it in yano</e>
<l2st>
<e>folding only on normal data</e>
<e>How to handle anomalies?</e>
<e>If we do not fold them, cross-validation is less useful</e>
<e>If we fold them, already rare anomalies become even rarer</e>
<e>->test set is always 50\% anomalous</e>
<e>->Also makes simple evaluation metrics (accuracy) more meaningful</e>
</l2st>
<e>Do you know a reason why cross-validation is not common in AD?</e>
<e>Are there problems with the way I fold my anomalies?</e>
</list>
</frame>
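The folding scheme above can be sketched as follows. This is one possible reading of the slide (folds built over normals only, each test fold topped up with anomalies toward a 50% anomaly rate); the helper name and the exact top-up rule are assumptions, not yano's actual code:

```python
# Sketch of anomaly-aware k-fold splitting: fold the normal points, then add
# anomalies to each test fold so the test set is ~50% anomalous.
# Illustrative scheme only; a real implementation might also rotate anomalies.
def anomaly_crossval(normals, anomalies, k):
    folds = []
    for i in range(k):
        test_norm = normals[i::k]                              # held-out normals
        train = [p for j, p in enumerate(normals) if j % k != i]
        test_anom = anomalies[: len(test_norm)]                # up to 50% anomalies
        folds.append((train, test_norm + test_anom))
    return folds

normals = list(range(10))       # toy normal samples
anomalies = [100, 101, 102]     # toy (rare) anomalies
for train, test in anomaly_crossval(normals, anomalies, k=5):
    print(len(train), len(test))  # each fold: 8 train, 4 test (2 normal + 2 anomalous)
```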
@ -0,0 +1,17 @@
<frame title="Logging">

<code>
from yano.logging import Logger
from pyod.models.iforest import IForest
from extended_iforest import train_extended_ifor

l = Logger({"IFor": IForest(n_estimators=100),
            "eIFor": train_extended_ifor})
for dataset, folds in pipeline(condition,
                               crossval(5),
                               normalize("minmax"),
                               shuffle):
    l.run_cross(dataset, folds)
latex = l.to_latex()
</code>

</frame>
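The LaTeX output on the later slides (mean ± std per model, best result in bold) can be produced roughly like this. `to_latex_row` is a hypothetical helper, not yano's `Logger`; note it bolds only the single best mean, whereas the real tables bold all models that are statistically equal to the best:

```python
# Sketch: turn per-model fold scores into a LaTeX table row like
# "$pima$ & $\textbf{0.7450} \pm 0.0071$ & $0.7350 \pm 0.0071$ \\".
# Hypothetical helper for illustration; simplified bolding rule.
from statistics import mean, stdev

def to_latex_row(dataset, scores):   # scores: {model: [fold AUCs]}
    means = {m: mean(v) for m, v in scores.items()}
    best = max(means.values())
    cells = []
    for m, v in scores.items():
        num = f"{means[m]:.4f}"
        if means[m] == best:
            num = f"\\textbf{{{num}}}"   # bold the best mean
        cells.append(f"${num} \\pm {stdev(v):.4f}$")
    return f"${dataset}$ & " + " & ".join(cells) + " \\\\"

row = to_latex_row("pima", {"eIFor": [0.74, 0.75], "IFor": [0.73, 0.74]})
print(row)
```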
@ -0,0 +1,9 @@
<frame title="Seeding">
<list>
<e>If you don't do anything, everything is seeded.</e>
<e>Makes rerunning a model until the performance is good quite obvious.</e>
<e>But as every run is seeded itself, this might induce bias.</e>
<e>Do you think this is worth it?</e>
<e>Are there any problems with this?</e>
</list>
</frame>
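Default seeding like this is often done by deriving a seed deterministically from the experiment identity. A minimal sketch of that idea (the derivation scheme is an assumption, not yano's actual mechanism):

```python
# Sketch: deterministic per-run seeding. Deriving the seed from the dataset
# name and run index makes every run reproducible by default, while different
# runs still get different streams. Illustrative scheme only.
import hashlib
import random

def run_seed(dataset_name, run_index):
    digest = hashlib.sha256(f"{dataset_name}:{run_index}".encode()).hexdigest()
    return int(digest[:8], 16)  # stable 32-bit seed

rng_a = random.Random(run_seed("pima", 0))
rng_b = random.Random(run_seed("pima", 0))
print(rng_a.random() == rng_b.random())  # -> True (same seed, same stream)
```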
@ -0,0 +1,24 @@
<frame >

\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$pc3$ & $\textbf{0.7231} \pm 0.0153$ & $\textbf{0.7223} \pm 0.0178$ \\
$pima$ & $\textbf{0.7405} \pm 0.0110$ & $\textbf{0.7347} \pm 0.0126$ \\
$Diabetes\_present$ & $\textbf{0.7414} \pm 0.0195$ & $\textbf{0.7344} \pm 0.0242$ \\
$waveform-5000$ & $\textbf{0.7687} \pm 0.0123$ & $\textbf{0.7592} \pm 0.0206$ \\
$vowels$ & $\textbf{0.7843} \pm 0.0298$ & $\textbf{0.7753} \pm 0.0334$ \\
$Vowel\_0$ & $\textbf{0.8425} \pm 0.0698$ & $0.7193 \pm 0.0817$ \\
$Abalone\_1\_8$ & $\textbf{0.8525} \pm 0.0263$ & $0.8452 \pm 0.0257$ \\
$annthyroid$ & $0.8399 \pm 0.0135$ & $\textbf{0.9087} \pm 0.0090$ \\
$Vehicle\_van$ & $\textbf{0.8792} \pm 0.0265$ & $\textbf{0.8697} \pm 0.0383$ \\
$ionosphere$ & $\textbf{0.9320} \pm 0.0069$ & $0.9086 \pm 0.0142$ \\
$breastw$ & $\textbf{0.9948} \pm 0.0031$ & $\textbf{0.9952} \pm 0.0033$ \\
$segment$ & $\textbf{1.0}$ & $\textbf{0.9993} \pm 0.0015$ \\
$$ & $$ & $$ \\
$Average$ & $\textbf{0.8005}$ & $\textbf{0.7957}$ \\
\hline
\end{tabular}

</frame>
@ -0,0 +1,10 @@
<frame title="statistics">
<list>
<e>Friedman test to see if there is a difference between the models</e>
<e>Nemenyi test to see which models are equal; mark those equal to the maximum</e>
<e>For 2 models, the Friedman test is not defined -> use the Wilcoxon test</e>
<e>Does this match your expectation from the table?</e>
<e>Two models are 'equal' if their probability of being from the same distribution is #LessThan(p_b,p)#, what value should #Eq(p_b,0.1)# have?</e>
<e>Do I need to correct for multiple testing (n experiments, so the threshold should be stricter for each), or is that clear from the table?</e>
</list>
</frame>
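The rank-based machinery behind the Friedman/Nemenyi procedure can be sketched as follows (after Demšar 2006): compute each model's average rank over all datasets, then compare rank differences against a critical difference CD = q_alpha * sqrt(k(k+1)/(6N)). The `q_alpha` value comes from a studentized-range table and is left as a parameter here; the toy scores are made up and, for simplicity, the sketch ignores ties (a full implementation would assign average ranks to tied scores):

```python
# Sketch: average ranks per model over datasets plus the Nemenyi critical
# difference CD = q_alpha * sqrt(k*(k+1)/(6*N)). Ties are ignored here.
import math

def average_ranks(scores):  # scores: {model: [score per dataset]}, higher = better
    models = list(scores)
    n = len(next(iter(scores.values())))
    ranks = {m: 0.0 for m in models}
    for i in range(n):
        ordered = sorted(models, key=lambda m: scores[m][i], reverse=True)
        for r, m in enumerate(ordered, start=1):
            ranks[m] += r
    return {m: ranks[m] / n for m in models}

def nemenyi_cd(k, n, q_alpha):
    # k models, n datasets; q_alpha from a studentized-range table
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))

scores = {"Knn": [0.80, 0.81, 0.99],   # toy AUCs over 3 datasets
          "IFor": [0.82, 0.78, 0.98],
          "Lof": [0.77, 0.80, 1.00]}
print(average_ranks(scores))
```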
@ -0,0 +1,24 @@
<section Experiments 1>
<repeat w="['ifor','eifor','qual']">
<frame title="Extended Isolation Forests">
<split>
<que>
<list>
<e>Isolation Forests are one algorithm for AD</e>
<e>Tries to isolate abnormal (rare) points instead of modelling normal ones</e>
<e>Creative approach->fairly successful (3000 citations)</e>
<e>Many follow-up papers</e>
<e>Extended Isolation Forest (Hariri et al. 2018, 140 citations)</e>
<e>Removes bias from the Isolation Forests</e>
<e>Also claims to improve their anomaly detection quality</e>
</list>
</que>

<que>
<i f="???" wmode="True"></i>
</que>
</split>

</frame>

</repeat>
@ -0,0 +1,22 @@
<frame >

\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$Delft\_pump\_5x3\_noisy$ & $\textbf{0.3893} \pm 0.0345$ & $\textbf{0.4272} \pm 0.0680$ \\
$vertebral$ & $\textbf{0.4260} \pm 0.0111$ & $\textbf{0.4554} \pm 0.0416$ \\
$Liver\_1$ & $0.5367 \pm 0.0508$ & $\textbf{0.5474} \pm 0.0541$ \\
$Sonar\_mines$ & $\textbf{0.6882} \pm 0.1264$ & $0.6189 \pm 0.1301$ \\
$letter$ & $\textbf{0.6756} \pm 0.0119$ & $0.6471 \pm 0.0111$ \\
$Glass\_building\_float$ & $\textbf{0.6480} \pm 0.1012$ & $\textbf{0.6755} \pm 0.1117$ \\
$pc3$ & $\textbf{0.7231} \pm 0.0153$ & $\textbf{0.7223} \pm 0.0178$ \\
$pima$ & $\textbf{0.7405} \pm 0.0110$ & $\textbf{0.7347} \pm 0.0126$ \\
$Diabetes\_present$ & $\textbf{0.7414} \pm 0.0195$ & $\textbf{0.7344} \pm 0.0242$ \\
$waveform-5000$ & $\textbf{0.7687} \pm 0.0123$ & $\textbf{0.7592} \pm 0.0206$ \\
$steel-plates-fault$ & $\textbf{0.7735} \pm 0.0351$ & $\textbf{0.7682} \pm 0.0402$ \\
$vowels$ & $\textbf{0.7843} \pm 0.0298$ & $\textbf{0.7753} \pm 0.0334$ \\
\hline
\end{tabular}

</frame>
@ -0,0 +1,22 @@
<frame >

\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$Vowel\_0$ & $\textbf{0.8425} \pm 0.0698$ & $0.7193 \pm 0.0817$ \\
$Housing\_low$ & $\textbf{0.7807} \pm 0.0333$ & $\textbf{0.7862} \pm 0.0336$ \\
$ozone-level-8hr$ & $\textbf{0.7904} \pm 0.0207$ & $\textbf{0.7768} \pm 0.0118$ \\
$Spectf\_0$ & $\textbf{0.8155} \pm 0.0255$ & $0.7535 \pm 0.0239$ \\
$HeartC$ & $0.7795 \pm 0.0258$ & $\textbf{0.8079} \pm 0.0255$ \\
$satellite$ & $\textbf{0.8125} \pm 0.0170$ & $\textbf{0.8103} \pm 0.0061$ \\
$optdigits$ & $\textbf{0.8099} \pm 0.0310$ & $\textbf{0.8142} \pm 0.0267$ \\
$spambase$ & $\textbf{0.8085} \pm 0.0110$ & $\textbf{0.8202} \pm 0.0042$ \\
$Abalone\_1\_8$ & $\textbf{0.8525} \pm 0.0263$ & $0.8452 \pm 0.0257$ \\
$qsar-biodeg$ & $\textbf{0.8584} \pm 0.0119$ & $\textbf{0.8628} \pm 0.0135$ \\
$annthyroid$ & $0.8399 \pm 0.0135$ & $\textbf{0.9087} \pm 0.0090$ \\
$Vehicle\_van$ & $\textbf{0.8792} \pm 0.0265$ & $\textbf{0.8697} \pm 0.0383$ \\
\hline
\end{tabular}

</frame>
@ -0,0 +1,21 @@
<frame >

\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$ionosphere$ & $\textbf{0.9320} \pm 0.0069$ & $0.9086 \pm 0.0142$ \\
$page-blocks$ & $0.9189 \pm 0.0061$ & $\textbf{0.9299} \pm 0.0016$ \\
$Ecoli$ & $\textbf{0.9418} \pm 0.0292$ & $0.9192 \pm 0.0332$ \\
$cardio$ & $\textbf{0.9564} \pm 0.0043$ & $\textbf{0.9535} \pm 0.0036$ \\
$wbc$ & $\textbf{0.9611} \pm 0.0121$ & $\textbf{0.9607} \pm 0.0107$ \\
$pendigits$ & $\textbf{0.9641} \pm 0.0097$ & $\textbf{0.9652} \pm 0.0076$ \\
$thyroid$ & $0.9818 \pm 0.0024$ & $\textbf{0.9871} \pm 0.0025$ \\
$breastw$ & $\textbf{0.9948} \pm 0.0031$ & $\textbf{0.9952} \pm 0.0033$ \\
$segment$ & $\textbf{1.0}$ & $\textbf{0.9993} \pm 0.0015$ \\
$$ & $$ & $$ \\
$Average$ & $\textbf{0.8005} \pm 0.1458$ & $\textbf{0.7957} \pm 0.1431$ \\
\hline
\end{tabular}

</frame>
@ -0,0 +1,4 @@
<section Experiments 2>
<frame title="highdim">
<i f="../prep/19highdim/a.png" wmode="True"></i>
</frame>
@ -0,0 +1,13 @@
<frame title="New Condition">

<code>
condition = ((number_of_samples > 200) &
             (number_of_samples < 10000) &
             (number_of_features > 50) &
             (number_of_features < 500) &
             ~index)
print(len(condition), "Datasets found")
</code>
->13 Datasets found

</frame>
@ -0,0 +1,12 @@
<frame title="New Models">

<code>
from pyod.models.iforest import IForest
from pyod.models.knn import KNN
from pyod.models.lof import LOF

l = Logger({"IFor": IForest(n_estimators=100),
            "Lof": LOF(),
            "Knn": KNN()}, addfeat=True)
</code>

</frame>
@ -0,0 +1,25 @@
<frame >

\begin{tabular}{llll}
\hline
Dataset & Knn & Lof & IFor \\
\hline
$Delft\_pump\_5x3\_noisy(64)$ & $0.3800 \pm 0.0475$ & $0.3462 \pm 0.0327$ & $\textbf{0.4272} \pm 0.0680$ \\
$hill-valley(100)$ & $0.4744 \pm 0.0269$ & $\textbf{0.5060} \pm 0.0327$ & $0.4720 \pm 0.0288$ \\
$speech(400)$ & $0.4903 \pm 0.0103$ & $\textbf{0.5104} \pm 0.0115$ & $0.4872 \pm 0.0184$ \\
$Sonar\_mines(60)$ & $\textbf{0.7284} \pm 0.0939$ & $0.6769 \pm 0.0933$ & $0.6189 \pm 0.1301$ \\
$ozone-level-8hr(72)$ & $\textbf{0.8051} \pm 0.0288$ & $0.7738 \pm 0.0292$ & $\textbf{0.7768} \pm 0.0118$ \\
$spambase(57)$ & $0.8038 \pm 0.0125$ & $0.7712 \pm 0.0055$ & $\textbf{0.8202} \pm 0.0042$ \\
$arrhythmia(274)$ & $\textbf{0.8137} \pm 0.0185$ & $0.8042 \pm 0.0186$ & $\textbf{0.8086} \pm 0.0099$ \\
$mnist(100)$ & $0.9345 \pm 0.0039$ & $\textbf{0.9548} \pm 0.0037$ & $0.8732 \pm 0.0069$ \\
$Concordia3\_32(256)$ & $0.9246 \pm 0.0107$ & $\textbf{0.9486} \pm 0.0099$ & $\textbf{0.9322} \pm 0.0178$ \\
$optdigits(64)$ & $0.9966 \pm 0.0012$ & $\textbf{0.9975} \pm 0.0012$ & $0.8142 \pm 0.0267$ \\
$gas-drift(128)$ & $\textbf{0.9790} \pm 0.0018$ & $0.9585 \pm 0.0055$ & $0.8764 \pm 0.0166$ \\
$Delft\_pump\_AR(160)$ & $\textbf{0.9965}$ & $\textbf{0.9953} \pm 0.0019$ & $0.9665 \pm 0.0096$ \\
$musk(166)$ & $\textbf{1.0}$ & $\textbf{1.0}$ & $0.9808 \pm 0.0117$ \\
$$ & $$ & $$ & $$ \\
$Average$ & $\textbf{0.7944}$ & $\textbf{0.7879}$ & $0.7580$ \\
\hline
\end{tabular}

</frame>
@ -0,0 +1,11 @@
<frame >

<l2st>
<e>Hypothesis: Isolation Forests are better when there are numerical and nominal attributes</e>
<e>Easy to test</e>
</l2st>
<code>
condition = condition & (numeric & nominal)
</code>

</frame>
@ -0,0 +1,20 @@
<frame >

\begin{tabular}{llll}
\hline
Dataset & Knn & IFor & Lof \\
\hline
$ozone-level-8hr(72)$ & $\textbf{0.8051} \pm 0.0288$ & $\textbf{0.7768} \pm 0.0118$ & $0.7738 \pm 0.0292$ \\
$spambase(57)$ & $0.8038 \pm 0.0125$ & $\textbf{0.8202} \pm 0.0042$ & $0.7712 \pm 0.0055$ \\
$arrhythmia(274)$ & $\textbf{0.8137} \pm 0.0185$ & $\textbf{0.8086} \pm 0.0099$ & $0.8042 \pm 0.0186$ \\
$musk(166)$ & $\textbf{1.0}$ & $0.9808 \pm 0.0117$ & $\textbf{1.0}$ \\
$$ & $$ & $$ & $$ \\
$Average$ & $\textbf{0.8556}$ & $\textbf{0.8466}$ & $\textbf{0.8373}$ \\
\hline
\end{tabular}
<l2st>
<e>Only 4 datasets, so not clear at all</e>
<e>->More datasets</e>
</l2st>

</frame>
@ -0,0 +1,12 @@
<section Experiments 3>
<frame title="Unsupervised Optimization">
<list>
<e>There are analyses that are only possible with many datasets</e>
<e>Here: unsupervised optimization</e>
<e>Given multiple AD models, find which one is best:</e>
<e>Use the AUC score? Requires anomalies->overfitting</e>
<e>Can you find an unsupervised method?</e>
<e>In general very complicated, so here we only focus on very small differences between the models.</e>
<e>So each model is an autoencoder, trained on the same dataset, where the difference is only in the initialisation</e>
</list>
</frame>
@ -0,0 +1,20 @@
<repeat w="['page-blocks','pima']">
<frame title="Loss Optimization">
<split>
<que>
<list>
<e>First guess: the loss of the model on the training data</e>
<e>How to evaluate this?</e>
<e>Train many models, look at the average AUC score.</e>
<e>For the alternative, take groups of 20 models and look at the AUC score of the best model.</e>
<e>Is there a meaningful difference between the results? Give the result as a z\_score (#(m_1-m_2)/sqrt(s_1**2+s_2**2)#)</e>
<e>This difference depends a lot on the dataset</e>
<e>->even #LessThan(30,z)# does not mean much</e>
</list>
</que>
<que>
<i f="histone_???" wmode="True"></i>
</que>
</split>
</frame>
</repeat>
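The z-score comparison above can be sketched directly: z = (m1 - m2) / sqrt(s1² + s2²) between the best-of-group AUCs and the full set of AUCs. The helper names and the toy AUC values below are made up for illustration:

```python
# Sketch: z-score between mean best-of-group AUCs and the mean of all AUCs,
# z = (m1 - m2) / sqrt(s1^2 + s2^2). Toy numbers, illustrative only.
import math
from statistics import mean, stdev

def z_score(scores_a, scores_b):
    m1, m2 = mean(scores_a), mean(scores_b)
    s1, s2 = stdev(scores_a), stdev(scores_b)
    return (m1 - m2) / math.sqrt(s1 ** 2 + s2 ** 2)

def best_of_groups(scores, group_size):
    # AUC of the best model in each group (the slide uses groups of 20)
    return [max(scores[i:i + group_size]) for i in range(0, len(scores), group_size)]

aucs = [0.70, 0.72, 0.71, 0.74, 0.69, 0.73, 0.75, 0.70]  # toy per-model AUCs
print(round(z_score(best_of_groups(aucs, 4), aucs), 3))
```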
@ -0,0 +1,12 @@
<frame title="loss">
<split>
<que>
<list>
<e>Pick the model with the lowest l2\-loss</e>
</list>
</que>
<que>
<i f="../prep/27loss/z_loss.pdf" wmode="True"></i>
</que>
</split>
</frame>
@ -0,0 +1,14 @@
<frame title="Robustness">
<split>
<que>
<list>
<e>Pick points within a 1\%-width neighbourhood in input space around each point.</e>
<e>For each point, find the maximum difference in output space.</e>
<e>Average this difference</e>
</list>
</que>
<que>
<i f="../prep/28Robustness/z_robu.pdf" wmode="True"></i>
</que>
</split>
</frame>
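The three steps above can be sketched as follows; the toy 1-D models and sampling parameters are assumptions for illustration, not the actual experiment:

```python
# Sketch: robustness = average (over points) of the maximum output change
# under perturbations within a 1%-width box in input space. Toy 1-D models.
import random

def robustness(model, points, width=0.01, n_samples=50, rng=random.Random(0)):
    diffs = []
    for p in points:
        base = model(p)
        worst = max(abs(model(p + rng.uniform(-width / 2, width / 2)) - base)
                    for _ in range(n_samples))
        diffs.append(worst)
    return sum(diffs) / len(diffs)

smooth = lambda x: 0.5 * x    # small slope -> small output change
steep = lambda x: 50.0 * x    # large slope -> large output change
pts = [0.1, 0.5, 0.9]
print(robustness(smooth, pts) < robustness(steep, pts))  # -> True
```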
@ -0,0 +1,14 @@
<frame title="Distance Correlation">
<split>
<que>
<list>
<e>Pick random points in the input space.</e>
<e>Measure the distance in input and output space</e>
<e>A low correlation indicates a good model</e>
</list>
</que>
<que>
<i f="../prep/29Distance_Correlation/z_dist.pdf" wmode="True"></i>
</que>
</split>
</frame>
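The measurement above can be sketched as: sample random input points, collect pairwise distances in input and output space, and correlate them. The use of Pearson correlation and the toy 1-D model are assumptions for illustration:

```python
# Sketch: correlate pairwise input-space distances with output-space distances
# for random points. A toy 1-D identity model gives correlation 1.0.
import math
import random

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    vy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (vx * vy)

def distance_correlation(model, n_points=40, rng=random.Random(0)):
    pts = [rng.uniform(0, 1) for _ in range(n_points)]
    d_in, d_out = [], []
    for i in range(n_points):
        for j in range(i + 1, n_points):
            d_in.append(abs(pts[i] - pts[j]))
            d_out.append(abs(model(pts[i]) - model(pts[j])))
    return pearson(d_in, d_out)

identity = lambda x: x
print(round(distance_correlation(identity), 2))  # -> 1.0
```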
@ -0,0 +1,12 @@
<section Conclusion>
<frame title="Other">
<list>
<e>Things I still want to add:</e>
<l2st>
<e>Ensemble methods</e>
<e>Visualisation options</e>
<e>Alternative evaluations</e>
<e>Hyperparameter optimisation (with cross-validation)</e>
</l2st>
</list>
</frame>
@ -0,0 +1,7 @@
<frame title="Feedback">
<list>
<e>What do you think about this?</e>
<e>Is there something I should add?</e>
<e>What would you need in order to actually use this?</e>
</list>
</frame>
@ -0,0 +1,12 @@
<plt>

<name Current experiment status>
<title pip install yano>
<stitle yano>

<institute ls9 tu Dortmund>

<theme CambridgeUS>
<colo dolphin>

</plt>
@ -0,0 +1 @@
Subproject commit 62ffd6ae589d7983791feea9d44d7658534d54a0
@ -0,0 +1,3 @@
pdflatex main.tex
pdflatex main.tex
@ -0,0 +1,3 @@
pdflatex main.tex
pdflatex main.tex
@ -0,0 +1,127 @@
[
  {
    "typ": "img",
    "files": [
      "../prep/04yano/a.png"
    ],
    "label": "prep04yanoapng",
    "caption": "",
    "where": "../yano//data/004yano.txt"
  },
  {
    "typ": "section",
    "title": "Basics",
    "label": "Basics",
    "file": "../yano//data/005selector.txt",
    "issec": true
  },
  {
    "typ": "section",
    "title": "Experiments 1",
    "label": "Experiments 1",
    "file": "../yano//data/016Extended Isolation Forests.txt",
    "issec": true
  },
  {
    "typ": "img",
    "files": [
      "../imgs/ifor"
    ],
    "label": "ifor",
    "caption": "",
    "where": "../yano//data/016Extended Isolation Forests.txt"
  },
  {
    "typ": "img",
    "files": [
      "../imgs/eifor"
    ],
    "label": "eifor",
    "caption": "",
    "where": "../yano//data/016Extended Isolation Forests.txt"
  },
  {
    "typ": "img",
    "files": [
      "../imgs/qual"
    ],
    "label": "qual",
    "caption": "",
    "where": "../yano//data/016Extended Isolation Forests.txt"
  },
  {
    "typ": "section",
    "title": "Experiments 2",
    "label": "Experiments 2",
    "file": "../yano//data/020highdim.txt",
    "issec": true
  },
  {
    "typ": "img",
    "files": [
      "../prep/19highdim/a.png"
    ],
    "label": "prep19highdimapng",
    "caption": "",
    "where": "../yano//data/020highdim.txt"
  },
  {
    "typ": "section",
    "title": "Experiments 3",
    "label": "Experiments 3",
    "file": "../yano//data/026Unsupervised Optimization.txt",
    "issec": true
  },
  {
    "typ": "img",
    "files": [
      "../imgs/histone_page-blocks"
    ],
    "label": "histone_page-blocks",
    "caption": "",
    "where": "../yano//data/027Loss Optimization.txt"
  },
  {
    "typ": "img",
    "files": [
      "../imgs/histone_pima"
    ],
    "label": "histone_pima",
    "caption": "",
    "where": "../yano//data/027Loss Optimization.txt"
  },
  {
    "typ": "img",
    "files": [
      "../prep/27loss/z_loss.pdf"
    ],
    "label": "prep27lossz_losspdf",
    "caption": "",
    "where": "../yano//data/028loss.txt"
  },
  {
    "typ": "img",
    "files": [
      "../prep/28Robustness/z_robu.pdf"
    ],
    "label": "prep28Robustnessz_robupdf",
    "caption": "",
    "where": "../yano//data/029Robustness.txt"
  },
  {
    "typ": "img",
    "files": [
      "../prep/29Distance_Correlation/z_dist.pdf"
    ],
    "label": "prep29Distance_Correlationz_distpdf",
    "caption": "",
    "where": "../yano//data/030Distance Correlation.txt"
  },
  {
    "typ": "section",
    "title": "Conclusion",
    "label": "Conclusion",
    "file": "../yano//data/031Other.txt",
    "issec": true
  }
]
|
||||
\@writefile{nav}{\headcommand {\slideentry {3}{0}{5}{27/27}{}{0}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@framepages {27}{27}}}
|
||||
\@writefile{nav}{\headcommand {\slideentry {3}{0}{6}{28/28}{}{0}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@framepages {28}{28}}}
|
||||
\@writefile{toc}{\beamer@sectionintoc {4}{Experiments 3}{29}{0}{4}}
|
||||
\@writefile{nav}{\headcommand {\beamer@sectionpages {23}{28}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@subsectionpages {23}{28}}}
|
||||
\@writefile{nav}{\headcommand {\sectionentry {4}{Experiments 3}{29}{Experiments 3}{0}}}
|
||||
\newlabel{sec:Experiments 3}{{4}{29}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {sec:Experiments 3}{29}}
|
||||
\newlabel{Unsupervised Optimization<1>}{{29}{29}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Unsupervised Optimization<1>}{29}}
|
||||
\newlabel{Unsupervised Optimization}{{29}{29}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Unsupervised Optimization}{29}}
|
||||
\@writefile{nav}{\headcommand {\slideentry {4}{0}{1}{29/29}{}{0}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@framepages {29}{29}}}
|
||||
\newlabel{Loss Optimization<1>}{{30}{30}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Loss Optimization<1>}{30}}
|
||||
\newlabel{Loss Optimization}{{30}{30}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Loss Optimization}{30}}
|
||||
\newlabel{fig:histone_page-blocks}{{30}{30}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {fig:histone_page-blocks}{30}}
|
||||
\@writefile{nav}{\headcommand {\slideentry {4}{0}{2}{30/30}{}{0}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@framepages {30}{30}}}
|
||||
\newlabel{Loss Optimization<1>}{{31}{31}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Loss Optimization<1>}{31}}
|
||||
\newlabel{Loss Optimization}{{31}{31}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Loss Optimization}{31}}
|
||||
\newlabel{fig:histone_pima}{{31}{31}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {fig:histone_pima}{31}}
|
||||
\@writefile{nav}{\headcommand {\slideentry {4}{0}{3}{31/31}{}{0}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@framepages {31}{31}}}
|
||||
\newlabel{loss<1>}{{32}{32}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {loss<1>}{32}}
|
||||
\newlabel{loss}{{32}{32}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {loss}{32}}
|
||||
\newlabel{fig:prep27lossz_losspdf}{{32}{32}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {fig:prep27lossz_losspdf}{32}}
|
||||
\@writefile{nav}{\headcommand {\slideentry {4}{0}{4}{32/32}{}{0}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@framepages {32}{32}}}
|
||||
\newlabel{Robustness<1>}{{33}{33}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Robustness<1>}{33}}
|
||||
\newlabel{Robustness}{{33}{33}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Robustness}{33}}
|
||||
\newlabel{fig:prep28Robustnessz_robupdf}{{33}{33}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {fig:prep28Robustnessz_robupdf}{33}}
|
||||
\@writefile{nav}{\headcommand {\slideentry {4}{0}{5}{33/33}{}{0}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@framepages {33}{33}}}
|
||||
\newlabel{Distance Correlation<1>}{{34}{34}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Distance Correlation<1>}{34}}
|
||||
\newlabel{Distance Correlation}{{34}{34}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Distance Correlation}{34}}
|
||||
\newlabel{fig:prep29Distance_Correlationz_distpdf}{{34}{34}{Experiments 3}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {fig:prep29Distance_Correlationz_distpdf}{34}}
|
||||
\@writefile{nav}{\headcommand {\slideentry {4}{0}{6}{34/34}{}{0}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@framepages {34}{34}}}
|
||||
\@writefile{toc}{\beamer@sectionintoc {5}{Conclusion}{35}{0}{5}}
|
||||
\@writefile{nav}{\headcommand {\beamer@sectionpages {29}{34}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@subsectionpages {29}{34}}}
|
||||
\@writefile{nav}{\headcommand {\sectionentry {5}{Conclusion}{35}{Conclusion}{0}}}
|
||||
\newlabel{sec:Conclusion}{{5}{35}{Conclusion}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {sec:Conclusion}{35}}
|
||||
\newlabel{Other<1>}{{35}{35}{Conclusion}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Other<1>}{35}}
|
||||
\newlabel{Other}{{35}{35}{Conclusion}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Other}{35}}
|
||||
\@writefile{nav}{\headcommand {\slideentry {5}{0}{1}{35/35}{}{0}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@framepages {35}{35}}}
|
||||
\newlabel{Feedback<1>}{{36}{36}{Conclusion}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Feedback<1>}{36}}
|
||||
\newlabel{Feedback}{{36}{36}{Conclusion}{Doc-Start}{}}
|
||||
\@writefile{snm}{\beamer@slide {Feedback}{36}}
|
||||
\@writefile{nav}{\headcommand {\slideentry {5}{0}{2}{36/36}{}{0}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@framepages {36}{36}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@partpages {1}{36}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@subsectionpages {35}{36}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@sectionpages {35}{36}}}
|
||||
\@writefile{nav}{\headcommand {\beamer@documentpages {36}}}
|
||||
\@writefile{nav}{\headcommand {\gdef \inserttotalframenumber {36}}}
|
||||
\gdef \@abspage@last{36}
|
File diff suppressed because it is too large
|
@ -0,0 +1,92 @@
|
|||
\headcommand {\slideentry {0}{0}{1}{1/1}{}{0}}
|
||||
\headcommand {\beamer@framepages {1}{1}}
|
||||
\headcommand {\slideentry {0}{0}{2}{2/2}{}{0}}
|
||||
\headcommand {\beamer@framepages {2}{2}}
|
||||
\headcommand {\slideentry {0}{0}{3}{3/3}{}{0}}
|
||||
\headcommand {\beamer@framepages {3}{3}}
|
||||
\headcommand {\slideentry {0}{0}{4}{4/4}{}{0}}
|
||||
\headcommand {\beamer@framepages {4}{4}}
|
||||
\headcommand {\slideentry {0}{0}{5}{5/5}{}{0}}
|
||||
\headcommand {\beamer@framepages {5}{5}}
|
||||
\headcommand {\beamer@sectionpages {1}{5}}
|
||||
\headcommand {\beamer@subsectionpages {1}{5}}
|
||||
\headcommand {\sectionentry {1}{Basics}{6}{Basics}{0}}
|
||||
\headcommand {\slideentry {1}{0}{1}{6/6}{}{0}}
|
||||
\headcommand {\beamer@framepages {6}{6}}
|
||||
\headcommand {\slideentry {1}{0}{2}{7/7}{}{0}}
|
||||
\headcommand {\beamer@framepages {7}{7}}
|
||||
\headcommand {\slideentry {1}{0}{3}{8/8}{}{0}}
|
||||
\headcommand {\beamer@framepages {8}{8}}
|
||||
\headcommand {\slideentry {1}{0}{4}{9/9}{}{0}}
|
||||
\headcommand {\beamer@framepages {9}{9}}
|
||||
\headcommand {\slideentry {1}{0}{5}{10/10}{}{0}}
|
||||
\headcommand {\beamer@framepages {10}{10}}
|
||||
\headcommand {\slideentry {1}{0}{6}{11/11}{}{0}}
|
||||
\headcommand {\beamer@framepages {11}{11}}
|
||||
\headcommand {\slideentry {1}{0}{7}{12/12}{}{0}}
|
||||
\headcommand {\beamer@framepages {12}{12}}
|
||||
\headcommand {\slideentry {1}{0}{8}{13/13}{}{0}}
|
||||
\headcommand {\beamer@framepages {13}{13}}
|
||||
\headcommand {\slideentry {1}{0}{9}{14/14}{}{0}}
|
||||
\headcommand {\beamer@framepages {14}{14}}
|
||||
\headcommand {\slideentry {1}{0}{10}{15/15}{}{0}}
|
||||
\headcommand {\beamer@framepages {15}{15}}
|
||||
\headcommand {\slideentry {1}{0}{11}{16/16}{}{0}}
|
||||
\headcommand {\beamer@framepages {16}{16}}
|
||||
\headcommand {\beamer@sectionpages {6}{16}}
|
||||
\headcommand {\beamer@subsectionpages {6}{16}}
|
||||
\headcommand {\sectionentry {2}{Experiments 1}{17}{Experiments 1}{0}}
|
||||
\headcommand {\slideentry {2}{0}{1}{17/17}{}{0}}
|
||||
\headcommand {\beamer@framepages {17}{17}}
|
||||
\headcommand {\slideentry {2}{0}{2}{18/18}{}{0}}
|
||||
\headcommand {\beamer@framepages {18}{18}}
|
||||
\headcommand {\slideentry {2}{0}{3}{19/19}{}{0}}
|
||||
\headcommand {\beamer@framepages {19}{19}}
|
||||
\headcommand {\slideentry {2}{0}{4}{20/20}{}{0}}
|
||||
\headcommand {\beamer@framepages {20}{20}}
|
||||
\headcommand {\slideentry {2}{0}{5}{21/21}{}{0}}
|
||||
\headcommand {\beamer@framepages {21}{21}}
|
||||
\headcommand {\slideentry {2}{0}{6}{22/22}{}{0}}
|
||||
\headcommand {\beamer@framepages {22}{22}}
|
||||
\headcommand {\beamer@sectionpages {17}{22}}
|
||||
\headcommand {\beamer@subsectionpages {17}{22}}
|
||||
\headcommand {\sectionentry {3}{Experiments 2}{23}{Experiments 2}{0}}
|
||||
\headcommand {\slideentry {3}{0}{1}{23/23}{}{0}}
|
||||
\headcommand {\beamer@framepages {23}{23}}
|
||||
\headcommand {\slideentry {3}{0}{2}{24/24}{}{0}}
|
||||
\headcommand {\beamer@framepages {24}{24}}
|
||||
\headcommand {\slideentry {3}{0}{3}{25/25}{}{0}}
|
||||
\headcommand {\beamer@framepages {25}{25}}
|
||||
\headcommand {\slideentry {3}{0}{4}{26/26}{}{0}}
|
||||
\headcommand {\beamer@framepages {26}{26}}
|
||||
\headcommand {\slideentry {3}{0}{5}{27/27}{}{0}}
|
||||
\headcommand {\beamer@framepages {27}{27}}
|
||||
\headcommand {\slideentry {3}{0}{6}{28/28}{}{0}}
|
||||
\headcommand {\beamer@framepages {28}{28}}
|
||||
\headcommand {\beamer@sectionpages {23}{28}}
|
||||
\headcommand {\beamer@subsectionpages {23}{28}}
|
||||
\headcommand {\sectionentry {4}{Experiments 3}{29}{Experiments 3}{0}}
|
||||
\headcommand {\slideentry {4}{0}{1}{29/29}{}{0}}
|
||||
\headcommand {\beamer@framepages {29}{29}}
|
||||
\headcommand {\slideentry {4}{0}{2}{30/30}{}{0}}
|
||||
\headcommand {\beamer@framepages {30}{30}}
|
||||
\headcommand {\slideentry {4}{0}{3}{31/31}{}{0}}
|
||||
\headcommand {\beamer@framepages {31}{31}}
|
||||
\headcommand {\slideentry {4}{0}{4}{32/32}{}{0}}
|
||||
\headcommand {\beamer@framepages {32}{32}}
|
||||
\headcommand {\slideentry {4}{0}{5}{33/33}{}{0}}
|
||||
\headcommand {\beamer@framepages {33}{33}}
|
||||
\headcommand {\slideentry {4}{0}{6}{34/34}{}{0}}
|
||||
\headcommand {\beamer@framepages {34}{34}}
|
||||
\headcommand {\beamer@sectionpages {29}{34}}
|
||||
\headcommand {\beamer@subsectionpages {29}{34}}
|
||||
\headcommand {\sectionentry {5}{Conclusion}{35}{Conclusion}{0}}
|
||||
\headcommand {\slideentry {5}{0}{1}{35/35}{}{0}}
|
||||
\headcommand {\beamer@framepages {35}{35}}
|
||||
\headcommand {\slideentry {5}{0}{2}{36/36}{}{0}}
|
||||
\headcommand {\beamer@framepages {36}{36}}
|
||||
\headcommand {\beamer@partpages {1}{36}}
|
||||
\headcommand {\beamer@subsectionpages {35}{36}}
|
||||
\headcommand {\beamer@sectionpages {35}{36}}
|
||||
\headcommand {\beamer@documentpages {36}}
|
||||
\headcommand {\gdef \inserttotalframenumber {36}}
|
|
@ -0,0 +1,5 @@
|
|||
\BOOKMARK [2][]{Outline0.1}{\376\377\000B\000a\000s\000i\000c\000s}{}% 1
|
||||
\BOOKMARK [2][]{Outline0.2}{\376\377\000E\000x\000p\000e\000r\000i\000m\000e\000n\000t\000s\000\040\0001}{}% 2
|
||||
\BOOKMARK [2][]{Outline0.3}{\376\377\000E\000x\000p\000e\000r\000i\000m\000e\000n\000t\000s\000\040\0002}{}% 3
|
||||
\BOOKMARK [2][]{Outline0.4}{\376\377\000E\000x\000p\000e\000r\000i\000m\000e\000n\000t\000s\000\040\0003}{}% 4
|
||||
\BOOKMARK [2][]{Outline0.5}{\376\377\000C\000o\000n\000c\000l\000u\000s\000i\000o\000n}{}% 5
|
Binary file not shown.
|
@ -0,0 +1,71 @@
|
|||
\beamer@slide {Problem<1>}{2}
|
||||
\beamer@slide {Problem}{2}
|
||||
\beamer@slide {Students<1>}{3}
|
||||
\beamer@slide {Students}{3}
|
||||
\beamer@slide {yano<1>}{4}
|
||||
\beamer@slide {yano}{4}
|
||||
\beamer@slide {yano<1>}{5}
|
||||
\beamer@slide {yano}{5}
|
||||
\beamer@slide {fig:prep04yanoapng}{5}
|
||||
\beamer@slide {sec:Basics}{6}
|
||||
\beamer@slide {selector<1>}{6}
|
||||
\beamer@slide {selector}{6}
|
||||
\beamer@slide {selectors<1>}{7}
|
||||
\beamer@slide {selectors}{7}
|
||||
\beamer@slide {iterating<1>}{8}
|
||||
\beamer@slide {iterating}{8}
|
||||
\beamer@slide {iterating<1>}{9}
|
||||
\beamer@slide {iterating}{9}
|
||||
\beamer@slide {pipeline<1>}{10}
|
||||
\beamer@slide {pipeline}{10}
|
||||
\beamer@slide {pipeline<1>}{11}
|
||||
\beamer@slide {pipeline}{11}
|
||||
\beamer@slide {CrossValidation<1>}{12}
|
||||
\beamer@slide {CrossValidation}{12}
|
||||
\beamer@slide {Logging<1>}{13}
|
||||
\beamer@slide {Logging}{13}
|
||||
\beamer@slide {Seeding<1>}{14}
|
||||
\beamer@slide {Seeding}{14}
|
||||
\beamer@slide {statistics<1>}{16}
|
||||
\beamer@slide {statistics}{16}
|
||||
\beamer@slide {sec:Experiments 1}{17}
|
||||
\beamer@slide {Extended Isolation Forests<1>}{17}
|
||||
\beamer@slide {Extended Isolation Forests}{17}
|
||||
\beamer@slide {fig:ifor}{17}
|
||||
\beamer@slide {Extended Isolation Forests<1>}{18}
|
||||
\beamer@slide {Extended Isolation Forests}{18}
|
||||
\beamer@slide {fig:eifor}{18}
|
||||
\beamer@slide {Extended Isolation Forests<1>}{19}
|
||||
\beamer@slide {Extended Isolation Forests}{19}
|
||||
\beamer@slide {fig:qual}{19}
|
||||
\beamer@slide {sec:Experiments 2}{23}
|
||||
\beamer@slide {highdim<1>}{23}
|
||||
\beamer@slide {highdim}{23}
|
||||
\beamer@slide {fig:prep19highdimapng}{23}
|
||||
\beamer@slide {New Condition<1>}{24}
|
||||
\beamer@slide {New Condition}{24}
|
||||
\beamer@slide {New Models<1>}{25}
|
||||
\beamer@slide {New Models}{25}
|
||||
\beamer@slide {sec:Experiments 3}{29}
|
||||
\beamer@slide {Unsupervised Optimization<1>}{29}
|
||||
\beamer@slide {Unsupervised Optimization}{29}
|
||||
\beamer@slide {Loss Optimization<1>}{30}
|
||||
\beamer@slide {Loss Optimization}{30}
|
||||
\beamer@slide {fig:histone_page-blocks}{30}
|
||||
\beamer@slide {Loss Optimization<1>}{31}
|
||||
\beamer@slide {Loss Optimization}{31}
|
||||
\beamer@slide {fig:histone_pima}{31}
|
||||
\beamer@slide {loss<1>}{32}
|
||||
\beamer@slide {loss}{32}
|
||||
\beamer@slide {fig:prep27lossz_losspdf}{32}
|
||||
\beamer@slide {Robustness<1>}{33}
|
||||
\beamer@slide {Robustness}{33}
|
||||
\beamer@slide {fig:prep28Robustnessz_robupdf}{33}
|
||||
\beamer@slide {Distance Correlation<1>}{34}
|
||||
\beamer@slide {Distance Correlation}{34}
|
||||
\beamer@slide {fig:prep29Distance_Correlationz_distpdf}{34}
|
||||
\beamer@slide {sec:Conclusion}{35}
|
||||
\beamer@slide {Other<1>}{35}
|
||||
\beamer@slide {Other}{35}
|
||||
\beamer@slide {Feedback<1>}{36}
|
||||
\beamer@slide {Feedback}{36}
|
File diff suppressed because it is too large
|
@ -0,0 +1,5 @@
|
|||
\beamer@sectionintoc {1}{Basics}{6}{0}{1}
|
||||
\beamer@sectionintoc {2}{Experiments 1}{17}{0}{2}
|
||||
\beamer@sectionintoc {3}{Experiments 2}{23}{0}{3}
|
||||
\beamer@sectionintoc {4}{Experiments 3}{29}{0}{4}
|
||||
\beamer@sectionintoc {5}{Conclusion}{35}{0}{5}
|
|
@ -0,0 +1 @@
|
|||
<titlepage>
|
|
@ -0,0 +1,10 @@
|
|||
Paper with Benedikt
|
||||
require multiple very specific datasets
|
||||
<l2st>
|
||||
many but not too many features
|
||||
at least some samples (for the NN)
|
||||
Best if only numerical attributes
|
||||
specific quality
|
||||
unrelated datasets
|
||||
</l2st>
|
||||
Requires you to search for many datasets and filter them
|
|
@ -0,0 +1,6 @@
|
|||
Not clear what you can use
|
||||
Many different formats
|
||||
train/test splits
|
||||
So for students, I just do this work myself and send them archives directly
|
||||
->Not a good solution
|
||||
|
|
@ -0,0 +1,8 @@
|
|||
So I have been packaging all my scripts
|
||||
I had a surprising amount of fun doing this
|
||||
<l2st>
|
||||
More than just standard functions
|
||||
A couple of weird decisions
|
||||
And this will likely grow further
|
||||
</l2st>
|
||||
->So I would like to discuss some parts with you, and maybe you even have additional features you would want
|
Binary file not shown.
After Width: | Height: | Size: 16 KiB |
|
@ -0,0 +1,6 @@
|
|||
Simply install it via pip
|
||||
Contains 187 real-world datasets
|
||||
->biggest library of datasets explicitly for anomaly detection
|
||||
Not yet happy with this
|
||||
especially since it mostly contains only numerical and nominal attributes
|
||||
->few categorical and no time-series attributes
|
|
@ -0,0 +1,17 @@
|
|||
<code>
|
||||
import yano
|
||||
from yano.symbols import *
|
||||
|
||||
|
||||
condition= (number_of_features>5) &
|
||||
(number_of_features<100) &
|
||||
(number_of_samples>100) &
|
||||
(number_of_samples<10000) &
|
||||
(number_of_samples>2*number_of_features) &
|
||||
~index
|
||||
|
||||
print(len(condition), "Datasets found")
|
||||
|
||||
|
||||
</code>
|
||||
->33 Datasets found
|
|
@ -0,0 +1,26 @@
|
|||
Lots of symbols like this
|
||||
<l2st>
|
||||
name
|
||||
number\_of\_features
|
||||
number\_of\_samples
|
||||
index (correlated datasets)
|
||||
</l2st>
|
||||
Feature types
|
||||
<l2st>
|
||||
numeric
|
||||
nominal
|
||||
categorical
|
||||
(textual)
|
||||
</l2st>
|
||||
Count based
|
||||
<l2st>
|
||||
number\_anomalies
|
||||
number\_normals
|
||||
fraction\_anomalies
|
||||
</l2st>
|
||||
Specific ones
|
||||
<l2st>
|
||||
image\_based
|
||||
(linearly\_seperable)
|
||||
</l2st>
|
||||
|
|
@ -0,0 +1,14 @@
|
|||
<code>
|
||||
for dataset in condition:
|
||||
    print(dataset)
|
||||
|
||||
|
||||
</code>
|
||||
<l2st>
|
||||
<e>\[annthyroid\]</e>
|
||||
<e>\[breastw\]</e>
|
||||
<e>\[cardio\]</e>
|
||||
<e>\[...\]</e>
|
||||
<e>\[Housing\_low\]</e>
|
||||
</l2st>
|
||||
|
|
@ -0,0 +1,8 @@
|
|||
<code>
|
||||
|
||||
for dataset in condition:
|
||||
    x=dataset.getx()
|
||||
    y=dataset.gety()
|
||||
|
||||
|
||||
</code>
|
|
@ -0,0 +1,15 @@
|
|||
<code>
|
||||
|
||||
from yano.iter import *
|
||||
|
||||
for dataset, x,tx,ty in pipeline(condition,
|
||||
split,
|
||||
shuffle,
|
||||
normalize("minmax")):
|
||||
    ...
|
||||
|
||||
|
||||
|
||||
|
||||
</code>
|
||||
|
|
@ -0,0 +1,12 @@
|
|||
Again, there are a couple of possible modifiers
|
||||
<l2st>
|
||||
nonconst->remove constant features
|
||||
shuffle
|
||||
normalize('zscore'/'minmax')
|
||||
cut(10)->at most 10 datasets
|
||||
split->train/test split, all anomalies in the test set
|
||||
crossval(5)->similar to split, but done multiple times (cross-validation)
|
||||
</l2st>
|
||||
modifiers interact with each other
|
||||
For example: normalize('minmax'), split
|
||||
->train set always below 1, but no guarantees for the test set
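This interaction can be sketched in plain numpy (hypothetical data; the actual pipeline internals may differ): the min-max parameters are fit on the training set only, so nothing bounds the test set.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.uniform(0, 1, size=(100, 3))
test = rng.uniform(0, 2, size=(20, 3))   # test data may leave the train range

# Min-max parameters are fit on the training set only.
lo, hi = train.min(axis=0), train.max(axis=0)
train_n = (train - lo) / (hi - lo)
test_n = (test - lo) / (hi - lo)

print(train_n.max())  # <= 1 by construction
print(test_n.max())   # can exceed 1
```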
|
|
@ -0,0 +1,14 @@
|
|||
Learned from DMC: Cross-validation is important
|
||||
Rarely found in Anomaly Detection, why?
|
||||
A bit more complicated (not all samples are equal), but no reason why not
|
||||
->So I implemented it into yano
|
||||
<l2st>
|
||||
folding only on normal data
|
||||
How to handle anomalies?
|
||||
If not folding them, cross-validation is less useful
|
||||
if folding them, already rare anomalies become even rarer
|
||||
->test set always 50\% anomalous
|
||||
->Also improves simple evaluation metrics (accuracy)
|
||||
</l2st>
|
||||
Do you know a reason why cross-validation is not common in AD?
|
||||
Are there problems with the way I fold my anomalies?
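A minimal sketch of this folding scheme (hypothetical helper, not the actual yano implementation): k-fold only over the normal points, then pad each test fold with anomalies, reused across folds, until it is half anomalous.

```python
import numpy as np

def anomaly_folds(y, k=5, seed=0):
    # Hypothetical sketch: k-fold over the normal points only; each test
    # fold gets as many anomalies as normal test points (drawn with
    # replacement if anomalies are scarce), so it is always half anomalous.
    rng = np.random.default_rng(seed)
    normal = rng.permutation(np.flatnonzero(y == 0))
    anom = np.flatnonzero(y == 1)
    for test_normal in np.array_split(normal, k):
        train = np.setdiff1d(normal, test_normal)
        test_anom = rng.choice(anom, size=len(test_normal),
                               replace=len(anom) < len(test_normal))
        yield train, np.concatenate([test_normal, test_anom])

y = np.array([0] * 90 + [1] * 10)
for train_idx, test_idx in anomaly_folds(y):
    pass  # train on normals only, evaluate on the half-anomalous test fold
```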
|
|
@ -0,0 +1,21 @@
|
|||
<code>
|
||||
from yano.logging import Logger
|
||||
from pyod.models.iforest import IForest
|
||||
from extended_iforest import train_extended_ifor
|
||||
|
||||
l=Logger({"IFor":IForest(n_estimators=100),
|
||||
"eIFor":train_extended_ifor})
|
||||
|
||||
for dataset, folds in pipeline(condition,
|
||||
crossval(5),
|
||||
normalize("minmax"),
|
||||
shuffle):
|
||||
    l.run_cross(dataset, folds)
|
||||
|
||||
latex=l.to_latex()
|
||||
|
||||
|
||||
</code>
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,5 @@
|
|||
If you don't do anything, everything is seeded.
|
||||
This makes rerunning a model until the performance looks good quite obvious.
|
||||
But as every run is seeded itself, this might induce bias.
|
||||
Do you think this is worth it?
|
||||
Are there any problems with this?
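One way such automatic seeding can be sketched (hypothetical helper; yano's actual scheme may differ): derive a fixed seed from the run's labels, so reruns cannot silently change results.

```python
import hashlib

import numpy as np

def fixed_seed(*labels):
    # Deterministic 32-bit seed from arbitrary labels (dataset, model, fold).
    digest = hashlib.sha256("|".join(map(str, labels)).encode()).digest()
    return int.from_bytes(digest[:4], "big")

rng = np.random.default_rng(fixed_seed("pima", "IFor", 0))
# Same labels -> same random stream on every rerun. The flip side: each run
# is tied to one fixed stream, which may itself be (un)lucky -> possible bias.
```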
|
|
@ -0,0 +1,21 @@
|
|||
\begin{tabular}{lll}
|
||||
\hline
|
||||
Dataset & eIFor & IFor \\
|
||||
\hline
|
||||
$pc3$ & $\textbf{0.7231} \pm 0.0153$ & $\textbf{0.7223} \pm 0.0178$ \\
|
||||
$pima$ & $\textbf{0.7405} \pm 0.0110$ & $\textbf{0.7347} \pm 0.0126$ \\
|
||||
$Diabetes\_present$ & $\textbf{0.7414} \pm 0.0195$ & $\textbf{0.7344} \pm 0.0242$ \\
|
||||
$waveform-5000$ & $\textbf{0.7687} \pm 0.0123$ & $\textbf{0.7592} \pm 0.0206$ \\
|
||||
$vowels$ & $\textbf{0.7843} \pm 0.0298$ & $\textbf{0.7753} \pm 0.0334$ \\
|
||||
$Vowel\_0$ & $\textbf{0.8425} \pm 0.0698$ & $0.7193 \pm 0.0817$ \\
|
||||
$Abalone\_1\_8$ & $\textbf{0.8525} \pm 0.0263$ & $0.8452 \pm 0.0257$ \\
|
||||
$annthyroid$ & $0.8399 \pm 0.0135$ & $\textbf{0.9087} \pm 0.0090$ \\
|
||||
$Vehicle\_van$ & $\textbf{0.8792} \pm 0.0265$ & $\textbf{0.8697} \pm 0.0383$ \\
|
||||
$ionosphere$ & $\textbf{0.9320} \pm 0.0069$ & $0.9086 \pm 0.0142$ \\
|
||||
$breastw$ & $\textbf{0.9948} \pm 0.0031$ & $\textbf{0.9952} \pm 0.0033$ \\
|
||||
$segment$ & $\textbf{1.0}$ & $\textbf{0.9993} \pm 0.0015$ \\
|
||||
$$ & $$ & $$ \\
|
||||
$Average$ & $\textbf{0.8005}$ & $\textbf{0.7957}$ \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
|
|
@ -0,0 +1,7 @@
|
|||
Friedman test to see if there is a difference between models
|
||||
Nemenyi test to see which models are equal, mark those equal to the maximum
|
||||
For 2 models, the Friedman test is not defined -> use the Wilcoxon test
|
||||
|
||||
Does this match your expectation from the table?
|
||||
Two models are 'equal' if their probability of being from the same distribution is #LessThan(p_b,p)#, what value should #Eq(p_b,0.1)# have?
|
||||
Do I need to correct for p-hacking (n experiments, so increase the difficulty for each, or is that clear from the table)?
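Both tests are available in scipy (the Nemenyi post-hoc test lives in the third-party scikit-posthocs package); a sketch on hypothetical per-dataset AUC scores:

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

rng = np.random.default_rng(0)
scores = rng.uniform(0.6, 0.95, size=(12, 3))  # rows: datasets, cols: models

# Friedman: is there any difference among the three models?
_, p_friedman = friedmanchisquare(*scores.T)

# For only two models Friedman is undefined -> Wilcoxon signed-rank test.
_, p_wilcoxon = wilcoxon(scores[:, 0], scores[:, 1])

# Pairwise follow-up (third-party): scikit_posthocs.posthoc_nemenyi_friedman
```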
|
|
@ -0,0 +1,9 @@
|
|||
Isolation Forests are one algorithm for AD
|
||||
Tries to isolate abnormal (rare) points instead of modelling normal ones
|
||||
Creative approach->fairly successful (3000 citations)
|
||||
Many follow up papers
|
||||
Extended Isolation Forest (Hariri et al. 2018, 140 citations)
|
||||
Removes bias from the original Isolation Forests
|
||||
Also claim to improve their anomaly detection quality
|
||||
(repeat with both cuts and ad quality)
|
||||
|
|
@ -0,0 +1,19 @@
|
|||
\begin{tabular}{lll}
|
||||
\hline
|
||||
Dataset & eIFor & IFor \\
|
||||
\hline
|
||||
$Delft\_pump\_5x3\_noisy$ & $\textbf{0.3893} \pm 0.0345$ & $\textbf{0.4272} \pm 0.0680$ \\
|
||||
$vertebral$ & $\textbf{0.4260} \pm 0.0111$ & $\textbf{0.4554} \pm 0.0416$ \\
|
||||
$Liver\_1$ & $0.5367 \pm 0.0508$ & $\textbf{0.5474} \pm 0.0541$ \\
|
||||
$Sonar\_mines$ & $\textbf{0.6882} \pm 0.1264$ & $0.6189 \pm 0.1301$ \\
|
||||
$letter$ & $\textbf{0.6756} \pm 0.0119$ & $0.6471 \pm 0.0111$ \\
|
||||
$Glass\_building\_float$ & $\textbf{0.6480} \pm 0.1012$ & $\textbf{0.6755} \pm 0.1117$ \\
|
||||
$pc3$ & $\textbf{0.7231} \pm 0.0153$ & $\textbf{0.7223} \pm 0.0178$ \\
|
||||
$pima$ & $\textbf{0.7405} \pm 0.0110$ & $\textbf{0.7347} \pm 0.0126$ \\
|
||||
$Diabetes\_present$ & $\textbf{0.7414} \pm 0.0195$ & $\textbf{0.7344} \pm 0.0242$ \\
|
||||
$waveform-5000$ & $\textbf{0.7687} \pm 0.0123$ & $\textbf{0.7592} \pm 0.0206$ \\
|
||||
$steel-plates-fault$ & $\textbf{0.7735} \pm 0.0351$ & $\textbf{0.7682} \pm 0.0402$ \\
|
||||
$vowels$ & $\textbf{0.7843} \pm 0.0298$ & $\textbf{0.7753} \pm 0.0334$ \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
|
|
@ -0,0 +1,19 @@
|
|||
\begin{tabular}{lll}
|
||||
\hline
|
||||
Dataset & eIFor & IFor \\
|
||||
\hline
|
||||
$Vowel\_0$ & $\textbf{0.8425} \pm 0.0698$ & $0.7193 \pm 0.0817$ \\
|
||||
$Housing\_low$ & $\textbf{0.7807} \pm 0.0333$ & $\textbf{0.7862} \pm 0.0336$ \\
|
||||
$ozone-level-8hr$ & $\textbf{0.7904} \pm 0.0207$ & $\textbf{0.7768} \pm 0.0118$ \\
|
||||
$Spectf\_0$ & $\textbf{0.8155} \pm 0.0255$ & $0.7535 \pm 0.0239$ \\
|
||||
$HeartC$ & $0.7795 \pm 0.0258$ & $\textbf{0.8079} \pm 0.0255$ \\
|
||||
$satellite$ & $\textbf{0.8125} \pm 0.0170$ & $\textbf{0.8103} \pm 0.0061$ \\
|
||||
$optdigits$ & $\textbf{0.8099} \pm 0.0310$ & $\textbf{0.8142} \pm 0.0267$ \\
|
||||
$spambase$ & $\textbf{0.8085} \pm 0.0110$ & $\textbf{0.8202} \pm 0.0042$ \\
|
||||
$Abalone\_1\_8$ & $\textbf{0.8525} \pm 0.0263$ & $0.8452 \pm 0.0257$ \\
|
||||
$qsar-biodeg$ & $\textbf{0.8584} \pm 0.0119$ & $\textbf{0.8628} \pm 0.0135$ \\
|
||||
$annthyroid$ & $0.8399 \pm 0.0135$ & $\textbf{0.9087} \pm 0.0090$ \\
|
||||
$Vehicle\_van$ & $\textbf{0.8792} \pm 0.0265$ & $\textbf{0.8697} \pm 0.0383$ \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
|
|
@ -0,0 +1,18 @@
|
|||
\begin{tabular}{lll}
|
||||
\hline
|
||||
Dataset & eIFor & IFor \\
|
||||
\hline
|
||||
$ionosphere$ & $\textbf{0.9320} \pm 0.0069$ & $0.9086 \pm 0.0142$ \\
|
||||
$page-blocks$ & $0.9189 \pm 0.0061$ & $\textbf{0.9299} \pm 0.0016$ \\
|
||||
$Ecoli$ & $\textbf{0.9418} \pm 0.0292$ & $0.9192 \pm 0.0332$ \\
|
||||
$cardio$ & $\textbf{0.9564} \pm 0.0043$ & $\textbf{0.9535} \pm 0.0036$ \\
|
||||
$wbc$ & $\textbf{0.9611} \pm 0.0121$ & $\textbf{0.9607} \pm 0.0107$ \\
|
||||
$pendigits$ & $\textbf{0.9641} \pm 0.0097$ & $\textbf{0.9652} \pm 0.0076$ \\
|
||||
$thyroid$ & $0.9818 \pm 0.0024$ & $\textbf{0.9871} \pm 0.0025$ \\
|
||||
$breastw$ & $\textbf{0.9948} \pm 0.0031$ & $\textbf{0.9952} \pm 0.0033$ \\
|
||||
$segment$ & $\textbf{1.0}$ & $\textbf{0.9993} \pm 0.0015$ \\
|
||||
$$ & $$ & $$ \\
|
||||
$Average$ & $\textbf{0.8005} \pm 0.1458$ & $\textbf{0.7957} \pm 0.1431$ \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
|
Binary file not shown.
After Width: | Height: | Size: 421 KiB |
|
@ -0,0 +1,13 @@
|
|||
<code>
|
||||
|
||||
condition= (number_of_samples>200) &
|
||||
(number_of_samples<10000) &
|
||||
(number_of_features>50) &
|
||||
(number_of_features<500) &
|
||||
~index
|
||||
|
||||
print(len(condition),"Datasets found")
|
||||
|
||||
|
||||
</code>
|
||||
->13 Datasets found
|
|
@ -0,0 +1,13 @@
|
|||
<code>
|
||||
from pyod.models.iforest import IForest
|
||||
from pyod.models.knn import KNN
|
||||
from pyod.models.lof import LOF
|
||||
|
||||
|
||||
l=Logger({"IFor":IForest(n_estimators=100),
|
||||
"Lof":LOF(),
|
||||
"Knn": KNN()}, addfeat=True)
|
||||
|
||||
|
||||
|
||||
</code>
|
|
@ -0,0 +1,21 @@
|
|||
\begin{tabular}{llll}
|
||||
\hline
|
||||
Dataset & Knn & Lof & IFor \\
|
||||
\hline
|
||||
$Delft\_pump\_5x3\_noisy(64)$ & $0.3800 \pm 0.0475$ & $0.3462 \pm 0.0327$ & $\textbf{0.4272} \pm 0.0680$ \\
|
||||
$hill-valley(100)$ & $0.4744 \pm 0.0269$ & $\textbf{0.5060} \pm 0.0327$ & $0.4720 \pm 0.0288$ \\
|
||||
$speech(400)$ & $0.4903 \pm 0.0103$ & $\textbf{0.5104} \pm 0.0115$ & $0.4872 \pm 0.0184$ \\
|
||||
$Sonar\_mines(60)$ & $\textbf{0.7284} \pm 0.0939$ & $0.6769 \pm 0.0933$ & $0.6189 \pm 0.1301$ \\
|
||||
$ozone-level-8hr(72)$ & $\textbf{0.8051} \pm 0.0288$ & $0.7738 \pm 0.0292$ & $\textbf{0.7768} \pm 0.0118$ \\
|
||||
$spambase(57)$ & $0.8038 \pm 0.0125$ & $0.7712 \pm 0.0055$ & $\textbf{0.8202} \pm 0.0042$ \\
|
||||
$arrhythmia(274)$ & $\textbf{0.8137} \pm 0.0185$ & $0.8042 \pm 0.0186$ & $\textbf{0.8086} \pm 0.0099$ \\
|
||||
$mnist(100)$ & $0.9345 \pm 0.0039$ & $\textbf{0.9548} \pm 0.0037$ & $0.8732 \pm 0.0069$ \\
|
||||
$Concordia3\_32(256)$ & $0.9246 \pm 0.0107$ & $\textbf{0.9486} \pm 0.0099$ & $\textbf{0.9322} \pm 0.0178$ \\
|
||||
$optdigits(64)$ & $0.9966 \pm 0.0012$ & $\textbf{0.9975} \pm 0.0012$ & $0.8142 \pm 0.0267$ \\
|
||||
$gas-drift(128)$ & $\textbf{0.9790} \pm 0.0018$ & $0.9585 \pm 0.0055$ & $0.8764 \pm 0.0166$ \\
|
||||
$Delft\_pump\_AR(160)$ & $\textbf{0.9965}$ & $\textbf{0.9953} \pm 0.0019$ & $0.9665 \pm 0.0096$ \\
|
||||
$musk(166)$ & $\textbf{1.0}$ & $\textbf{1.0}$ & $0.9808 \pm 0.0117$ \\
|
||||
$$ & $$ & $$ & $$ \\
|
||||
$Average$ & $\textbf{0.7944}$ & $\textbf{0.7879}$ & $0.7580$ \\
|
||||
\hline
|
||||
\end{tabular}
|
|
@ -0,0 +1,7 @@
|
|||
<l2st>
|
||||
<e>Hypothesis: Isolation Forests are better when there are numerical and nominal attributes</e>
|
||||
<e>Easy to test</e>
|
||||
</l2st>
|
||||
<code>
|
||||
condition=condition & (numeric & nominal)
|
||||
</code>
|
|
@ -0,0 +1,19 @@
|
|||
\begin{tabular}{llll}
|
||||
\hline
|
||||
Dataset & Knn & IFor & Lof \\
|
||||
\hline
|
||||
$ozone-level-8hr(72)$ & $\textbf{0.8051} \pm 0.0288$ & $\textbf{0.7768} \pm 0.0118$ & $0.7738 \pm 0.0292$ \\
|
||||
$spambase(57)$ & $0.8038 \pm 0.0125$ & $\textbf{0.8202} \pm 0.0042$ & $0.7712 \pm 0.0055$ \\
|
||||
$arrhythmia(274)$ & $\textbf{0.8137} \pm 0.0185$ & $\textbf{0.8086} \pm 0.0099$ & $0.8042 \pm 0.0186$ \\
|
||||
$musk(166)$ & $\textbf{1.0}$ & $0.9808 \pm 0.0117$ & $\textbf{1.0}$ \\
|
||||
$$ & $$ & $$ & $$ \\
|
||||
$Average$ & $\textbf{0.8556}$ & $\textbf{0.8466}$ & $\textbf{0.8373}$ \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
|
||||
<l2st>
|
||||
<e>Only 4 datasets, so not clear at all</e>
|
||||
<e>->More datasets</e>
|
||||
|
||||
</l2st>
|
||||
|
|
@ -0,0 +1,7 @@
|
|||
There are analyses that are only possible with many datasets
|
||||
Here: unsupervised optimization
|
||||
Given multiple AD models, find which is best:
|
||||
Use the AUC score? Requires anomalies->overfitting
|
||||
Can you find an unsupervised method?
|
||||
In general very complicated, so here only focus on very small differences in the model.
|
||||
So each model is an autoencoder, trained on the same dataset, where the difference is only in the initialisation
|
|
@ -0,0 +1,8 @@
|
|||
First guess: the loss of the model on the training data
|
||||
How to evaluate this?
|
||||
Train many models, look at the average AUC score.
|
||||
For the alternative, take groups of 20 models, and look at the AUC score of the best model.
|
||||
Is there a meaningful difference between results? Give the result as z\_score (#(m_1-m_2)/sqrt(s_1**2+s_2**2)#)
|
||||
This difference depends a lot on the dataset
|
||||
->even a really good z\_score does not mean much (sometimes #LessThan(30,z)#)
|
||||
(repeat with the two histograms)
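The z\_score above is straightforward to compute; for example, with the pima means and standard deviations from the earlier results table:

```python
from math import sqrt

def z_score(m1, s1, m2, s2):
    # Difference of means in units of the combined standard deviation.
    return (m1 - m2) / sqrt(s1**2 + s2**2)

# eIFor vs. IFor on pima (means and stds taken from the results table):
z = z_score(0.7405, 0.0110, 0.7347, 0.0126)  # roughly 0.35
```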
|
|
@ -0,0 +1 @@
|
|||
Pick the model with the lowest l2-loss
|
Binary file not shown.
|
@ -0,0 +1,3 @@
|
|||
Pick points within a 1\% width neighborhood in input space around each point.
|
||||
For each point, find the maximum difference in output space.
|
||||
Average this difference.
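A numpy sketch of this robustness measure, with a toy scoring function standing in for a trained autoencoder (all names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x):
    return np.sin(3 * x).sum(axis=-1)  # stand-in for a trained model

x = rng.uniform(0, 1, size=(100, 5))
width = 0.01 * (x.max(axis=0) - x.min(axis=0))  # 1% of each feature's range

diffs = []
for point in x:
    # Sample neighbors inside the 1%-wide box around the point...
    neighbors = point + rng.uniform(-width / 2, width / 2, size=(50, 5))
    # ...and record the largest change of the model output.
    diffs.append(np.abs(score(neighbors) - score(point)).max())

robustness = float(np.mean(diffs))  # lower = smoother, more robust model
```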
|
Binary file not shown.
|
@ -0,0 +1,3 @@
|
|||
Pick random points in the input space.
|
||||
Measure the distance in input and output space.
|
||||
A low correlation indicates a good model.
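A sketch of this correlation measure, using plain Pearson correlation over pairwise distances (toy scoring function, hypothetical names):

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x):
    return np.sin(3 * x).sum(axis=-1)  # stand-in for a trained model

x = rng.uniform(0, 1, size=(200, 5))  # random points in input space
s = score(x)

# Pairwise distances in input space and in the 1-D output space.
d_in = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
d_out = np.abs(s[:, None] - s[None, :])

iu = np.triu_indices(len(x), k=1)  # count each pair once
corr = float(np.corrcoef(d_in[iu], d_out[iu])[0, 1])  # low = good model
```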
|
Binary file not shown.
|
@ -0,0 +1,9 @@
|
|||
Things I still want to add:
|
||||
<l2st>
|
||||
Ensemble Methods
|
||||
Visualisation options
|
||||
Alternative Evaluations
|
||||
Hyperparameter optimisation (with cross-validation)
|
||||
|
||||
|
||||
</l2st>
|
|
@ -0,0 +1,3 @@
|
|||
What do you think about this?
|
||||
Is there something I should also add?
|
||||
What would it take for you to actually use this?
|
Binary file not shown.
After Width: | Height: | Size: 20 KiB |
Binary file not shown.
After Width: | Height: | Size: 989 KiB |
Some files were not shown because too many files have changed in this diff