initial push

Simon Klüttermann 2022-05-23 18:49:15 +02:00
commit ce067dbbf2
101 changed files with 3885 additions and 0 deletions

5
data/000.txt Normal file

@ -0,0 +1,5 @@
<frame >
<titlepage>
</frame>

14
data/001Problem.txt Normal file

@ -0,0 +1,14 @@
<frame title="Problem">
<list>
<e>Paper with Benedikt</e>
<e>require multiple very specific datasets</e>
<l2st>
<e>many but not too many features</e>
<e>at least some samples (for the NN)</e>
<e>Only numerical attributes best</e>
<e>specific quality</e>
<e>unrelated datasets</e>
</l2st>
<e>Requires you to search for many datasets and filter them</e>
</list>
</frame>

9
data/002Students.txt Normal file

@ -0,0 +1,9 @@
<frame title="Students">
<list>
<e>Not clear what you can use</e>
<e>Many different formats</e>
<e>train/test splits</e>
<e>So for Students I just do this work and send them archives directly</e>
<e>->Not a good solution</e>
</list>
</frame>

12
data/003yano.txt Normal file

@ -0,0 +1,12 @@
<frame title="yano">
<list>
<e>So I have been packaging all my scripts</e>
<e>I had a surprising amount of fun doing this</e>
<l2st>
<e>More than just standard functions</e>
<e>A couple of weird decisions</e>
<e>And this will likely grow further</e>
</l2st>
<e>->So I would like to discuss some parts with you and maybe you even have more features you might want</e>
</list>
</frame>

17
data/004yano.txt Normal file

@ -0,0 +1,17 @@
<frame title="yano">
<split>
<que>
<list>
<e>Simply install it via pip</e>
<e>Contains 187 real-world datasets</e>
<e>->the biggest library of datasets explicitly for anomaly detection</e>
<e>not yet happy with this</e>
<e>especially since it mostly contains numerical and nominal attributes</e>
<e>->few categorical and no time-series attributes</e>
</list>
</que>
<que>
<i f="../prep/04yano/a.png" wmode="True"></i>
</que>
</split>
</frame>

17
data/005selector.txt Normal file

@ -0,0 +1,17 @@
<section Basics>
<frame title="selector">
<code>
import yano
from yano.symbols import *
condition= ((number_of_features>5) &
            (number_of_features<100) &
            (number_of_samples>100) &
            (number_of_samples<10000) &
            (number_of_samples>2*number_of_features) &
            ~index)
print(len(condition), "Datasets found")
</code>
->33 Datasets found
</frame>

29
data/006selectors.txt Normal file

@ -0,0 +1,29 @@
<frame title="selectors">
<list>
<e>Lots of symbols like this</e>
<l2st>
<e>name</e>
<e>number\_of\_features</e>
<e>number\_of\_samples</e>
<e>index (correlated datasets)</e>
</l2st>
<e>Feature types</e>
<l2st>
<e>numeric</e>
<e>nominal</e>
<e>categorical</e>
<e>(textual)</e>
</l2st>
<e>Count based</e>
<l2st>
<e>number\_anomalies</e>
<e>number\_normals</e>
<e>fraction\_anomalies</e>
</l2st>
<e>Specific ones</e>
<l2st>
<e>image\_based</e>
<e>(linearly\_seperable)</e>
</l2st>
</list>
</frame>
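Selector expressions like `(number_of_features>5) & ~index` work through Python operator overloading. A minimal sketch of the idea, assuming a dict of per-dataset metadata; the class names and `select` helper here are made up for illustration and are not yano's actual implementation:

```python
class Condition:
    """A composable predicate over dataset-metadata dicts."""
    def __init__(self, pred):
        self.pred = pred  # pred: metadata dict -> bool
    def __and__(self, other):   # `cond_a & cond_b`
        return Condition(lambda d: self.pred(d) and other.pred(d))
    def __invert__(self):       # `~cond`
        return Condition(lambda d: not self.pred(d))
    def select(self, datasets):
        return [d for d in datasets if self.pred(d)]

class Symbol:
    """A named metadata field; comparisons produce Conditions."""
    def __init__(self, key):
        self.key = key
    def __gt__(self, value):
        return Condition(lambda d: d[self.key] > value)
    def __lt__(self, value):
        return Condition(lambda d: d[self.key] < value)

number_of_features = Symbol("n_features")
number_of_samples = Symbol("n_samples")

datasets = [
    {"name": "a", "n_features": 10, "n_samples": 500},
    {"name": "b", "n_features": 3,  "n_samples": 500},
    {"name": "c", "n_features": 50, "n_samples": 50},
]
condition = (number_of_features > 5) & (number_of_samples > 100)
print([d["name"] for d in condition.select(datasets)])  # -> ['a']
```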

15
data/007iterating.txt Normal file

@ -0,0 +1,15 @@
<frame title="iterating">
<code>
for dataset in condition:
    print(dataset)
</code>
<l2st>
<e>\[annthyroid\]</e>
<e>\[breastw\]</e>
<e>\[cardio\]</e>
<e>\[...\]</e>
<e>\[Housing\_low\]</e>
</l2st>
</frame>

9
data/008iterating.txt Normal file

@ -0,0 +1,9 @@
<frame title="iterating">
<code>
for dataset in condition:
    x=dataset.getx()
    y=dataset.gety()
</code>
</frame>

12
data/009pipeline.txt Normal file

@ -0,0 +1,12 @@
<frame title="pipeline">
<code>
from yano.iter import *
for dataset, x,tx,ty in pipeline(condition,
split,
shuffle,
normalize("minmax")):
...
</code>
</frame>

16
data/010pipeline.txt Normal file

@ -0,0 +1,16 @@
<frame title="pipeline">
<list>
<e>Again, there are a couple of modifiers available</e>
<l2st>
<e>nonconst->remove constant features</e>
<e>shuffle</e>
<e>normalize('zscore'/'minmax')</e>
<e>cut(10)->at most 10 datasets</e>
<e>split->train test split, all anomalies in test set</e>
<e>crossval(5)->similar to split, but do multiple times (crossvalidation)</e>
</l2st>
<e>modifiers interact with each other</e>
<e>For example: normalize('minmax'), split</e>
<e>->train set always below 1, but no guarantees for the test set</e>
</list>
</frame>
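The modifier chain can be pictured as plain function composition: each modifier transforms the data before the next one runs, which is also why order matters (normalize before or after split gives different guarantees). A toy sketch under that assumption, with simplified one-dimensional data; not yano's actual pipeline code:

```python
import random

def shuffle(x, seed=0):
    """Return a seeded random permutation of x."""
    rng = random.Random(seed)
    x = list(x)
    rng.shuffle(x)
    return x

def normalize(mode):
    """Return a modifier scaling values; only 'minmax' is sketched."""
    def apply(x):
        if mode == "minmax":
            lo, hi = min(x), max(x)
            return [(v - lo) / (hi - lo) for v in x]
        raise ValueError(mode)
    return apply

def pipeline(x, *modifiers):
    """Apply the modifiers left to right."""
    for m in modifiers:
        x = m(x)
    return x

out = pipeline([4.0, 8.0, 6.0], shuffle, normalize("minmax"))
print(min(out), max(out))  # -> 0.0 1.0
```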


@ -0,0 +1,18 @@
<frame title="CrossValidation">
<list>
<e>Learned from DMC: Crossvalidation is important</e>
<e>Rarely found in Anomaly Detection, why?</e>
<e>A bit more complicated (not all samples are equal), but no reason why not</e>
<e>->So I implemented it into yano</e>
<l2st>
<e>folding only on normal data</e>
<e>How to handle anomalies?</e>
<e>If not folding them, cross-validation is less useful</e>
<e>if folding them, already rare anomalies become even rarer</e>
<e>->test set always 50\% anomalous</e>
<e>->Also improves simple evaluation metrics (accuracy)</e>
</l2st>
<e>Do you know a reason why Cross Validation is not common in AD?</e>
<e>Are there Problems with the way I fold my Anomalies?</e>
</list>
</frame>
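The folding scheme described above can be sketched as follows: only the normal points are folded, and each test fold is padded with anomalies until it is 50% anomalous. Whether anomalies are drawn with or without replacement is an assumption here, not necessarily what yano does:

```python
import random

def anomaly_crossval(normals, anomalies, k=5, seed=0):
    """k-fold CV that folds only normals; test folds are 50% anomalous."""
    rng = random.Random(seed)
    normals = list(normals)
    rng.shuffle(normals)
    folds = [normals[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j in range(k) if j != i for x in folds[j]]
        test_norm = folds[i]
        # draw as many anomalies as normal test points -> 50% anomalous
        test_anom = rng.choices(anomalies, k=len(test_norm))
        yield train, test_norm + test_anom

folds = list(anomaly_crossval(range(100), ["a", "b", "c"]))
train, test = folds[0]
print(len(train), len(test))  # -> 80 40
```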

17
data/012Logging.txt Normal file

@ -0,0 +1,17 @@
<frame title="Logging">
<code>
from yano.logging import Logger
from pyod.models.iforest import IForest
from extended_iforest import train_extended_ifor
l=Logger({"IFor":IForest(n_estimators=100),
"eIFor":train_extended_ifor})
for dataset, folds in pipeline(condition,
crossval(5),
normalize("minmax"),
shuffle):
    l.run_cross(dataset, folds)
latex=l.to_latex()
</code>
</frame>

9
data/013Seeding.txt Normal file

@ -0,0 +1,9 @@
<frame title="Seeding">
<list>
<e>If you don't do anything, everything is seeded.</e>
<e>Makes rerunning a Model until the performance is good quite obvious.</e>
<e>But as every Run is seeded itself, this might induce bias.</e>
<e>Do you think this is worth it?</e>
<e>Are there any Problems with this?</e>
</list>
</frame>
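One way "everything is seeded by default" can be realised is to derive a deterministic seed from the run's identity (model, dataset, fold), so reruns are reproducible without any user action. A sketch of that idea; this is an illustration, not necessarily how yano derives its seeds:

```python
import hashlib

def run_seed(model: str, dataset: str, fold: int) -> int:
    """Deterministic 32-bit seed derived from the run identity."""
    key = f"{model}/{dataset}/{fold}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")

# Same run -> same seed; any change to the run -> a different seed.
print(run_seed("IFor", "cardio", 0) == run_seed("IFor", "cardio", 0))  # -> True
```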

24
data/014.txt Normal file

@ -0,0 +1,24 @@
<frame >
\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$pc3$ & $\textbf{0.7231} \pm 0.0153$ & $\textbf{0.7223} \pm 0.0178$ \\
$pima$ & $\textbf{0.7405} \pm 0.0110$ & $\textbf{0.7347} \pm 0.0126$ \\
$Diabetes\_present$ & $\textbf{0.7414} \pm 0.0195$ & $\textbf{0.7344} \pm 0.0242$ \\
$waveform-5000$ & $\textbf{0.7687} \pm 0.0123$ & $\textbf{0.7592} \pm 0.0206$ \\
$vowels$ & $\textbf{0.7843} \pm 0.0298$ & $\textbf{0.7753} \pm 0.0334$ \\
$Vowel\_0$ & $\textbf{0.8425} \pm 0.0698$ & $0.7193 \pm 0.0817$ \\
$Abalone\_1\_8$ & $\textbf{0.8525} \pm 0.0263$ & $0.8452 \pm 0.0257$ \\
$annthyroid$ & $0.8399 \pm 0.0135$ & $\textbf{0.9087} \pm 0.0090$ \\
$Vehicle\_van$ & $\textbf{0.8792} \pm 0.0265$ & $\textbf{0.8697} \pm 0.0383$ \\
$ionosphere$ & $\textbf{0.9320} \pm 0.0069$ & $0.9086 \pm 0.0142$ \\
$breastw$ & $\textbf{0.9948} \pm 0.0031$ & $\textbf{0.9952} \pm 0.0033$ \\
$segment$ & $\textbf{1.0}$ & $\textbf{0.9993} \pm 0.0015$ \\
$$ & $$ & $$ \\
$Average$ & $\textbf{0.8005}$ & $\textbf{0.7957}$ \\
\hline
\end{tabular}
</frame>

10
data/015statistics.txt Normal file

@ -0,0 +1,10 @@
<frame title="statistics">
<list>
<e>Friedman test to see if there is a difference between models</e>
<e>Nemenyi test to see which models are equal, mark those equal to the maximum</e>
<e>For 2 models, Friedman not defined -> use Wilcoxon test</e>
<e>Does this match your expectation from the table?</e>
<e>Two models are 'equal' if their probability of being from the same distribution is #LessThan(p_b,p)#, what value should #Eq(p_b,0.1)# have?</e>
<e>Do I need to correct for p-hacking (n experiments, so increase the difficulty for each), or is that clear from the table?</e>
</list>
</frame>
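The test cascade above can be run with scipy: Friedman across more than two models, Wilcoxon for exactly two. The Nemenyi post-hoc step is not in scipy (an extra package such as scikit-posthocs would be needed) and is omitted here; the scores below are made-up numbers purely for illustration:

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

# rows: datasets, columns: models (illustrative AUC-style scores)
scores = np.array([
    [0.72, 0.71, 0.60],
    [0.74, 0.73, 0.61],
    [0.84, 0.72, 0.65],
    [0.93, 0.91, 0.70],
    [0.99, 0.98, 0.80],
    [0.80, 0.79, 0.64],
])

stat, p = friedmanchisquare(*scores.T)            # >2 models: any difference?
stat2, p2 = wilcoxon(scores[:, 0], scores[:, 1])  # 2 models: pairwise test
print(f"Friedman p={p:.4f}, Wilcoxon p={p2:.4f}")
```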


@ -0,0 +1,24 @@
<section Experiments 1>
<repeat w="['ifor','eifor','qual']">
<frame title="Extended Isolation Forests">
<split>
<que>
<list>
<e>Isolation Forests are one algorithm for AD</e>
<e>Tries to isolate abnormal (rare) points instead of modelling normal ones</e>
<e>Creative approach->fairly successful (3000 Citations)</e>
<e>Many follow up papers</e>
<e>Extended Isolation Forest (Hariri et al. 2018, 140 Citations)</e>
<e>Remove bias from the Isolation Forests</e>
<e>Also claim to improve their anomaly detection quality</e>
</list>
</que>
<que>
<i f="???" wmode="True"></i>
</que>
</split>
</frame>
</repeat>

22
data/017.txt Normal file

@ -0,0 +1,22 @@
<frame >
\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$Delft\_pump\_5x3\_noisy$ & $\textbf{0.3893} \pm 0.0345$ & $\textbf{0.4272} \pm 0.0680$ \\
$vertebral$ & $\textbf{0.4260} \pm 0.0111$ & $\textbf{0.4554} \pm 0.0416$ \\
$Liver\_1$ & $0.5367 \pm 0.0508$ & $\textbf{0.5474} \pm 0.0541$ \\
$Sonar\_mines$ & $\textbf{0.6882} \pm 0.1264$ & $0.6189 \pm 0.1301$ \\
$letter$ & $\textbf{0.6756} \pm 0.0119$ & $0.6471 \pm 0.0111$ \\
$Glass\_building\_float$ & $\textbf{0.6480} \pm 0.1012$ & $\textbf{0.6755} \pm 0.1117$ \\
$pc3$ & $\textbf{0.7231} \pm 0.0153$ & $\textbf{0.7223} \pm 0.0178$ \\
$pima$ & $\textbf{0.7405} \pm 0.0110$ & $\textbf{0.7347} \pm 0.0126$ \\
$Diabetes\_present$ & $\textbf{0.7414} \pm 0.0195$ & $\textbf{0.7344} \pm 0.0242$ \\
$waveform-5000$ & $\textbf{0.7687} \pm 0.0123$ & $\textbf{0.7592} \pm 0.0206$ \\
$steel-plates-fault$ & $\textbf{0.7735} \pm 0.0351$ & $\textbf{0.7682} \pm 0.0402$ \\
$vowels$ & $\textbf{0.7843} \pm 0.0298$ & $\textbf{0.7753} \pm 0.0334$ \\
\hline
\end{tabular}
</frame>

22
data/018.txt Normal file

@ -0,0 +1,22 @@
<frame >
\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$Vowel\_0$ & $\textbf{0.8425} \pm 0.0698$ & $0.7193 \pm 0.0817$ \\
$Housing\_low$ & $\textbf{0.7807} \pm 0.0333$ & $\textbf{0.7862} \pm 0.0336$ \\
$ozone-level-8hr$ & $\textbf{0.7904} \pm 0.0207$ & $\textbf{0.7768} \pm 0.0118$ \\
$Spectf\_0$ & $\textbf{0.8155} \pm 0.0255$ & $0.7535 \pm 0.0239$ \\
$HeartC$ & $0.7795 \pm 0.0258$ & $\textbf{0.8079} \pm 0.0255$ \\
$satellite$ & $\textbf{0.8125} \pm 0.0170$ & $\textbf{0.8103} \pm 0.0061$ \\
$optdigits$ & $\textbf{0.8099} \pm 0.0310$ & $\textbf{0.8142} \pm 0.0267$ \\
$spambase$ & $\textbf{0.8085} \pm 0.0110$ & $\textbf{0.8202} \pm 0.0042$ \\
$Abalone\_1\_8$ & $\textbf{0.8525} \pm 0.0263$ & $0.8452 \pm 0.0257$ \\
$qsar-biodeg$ & $\textbf{0.8584} \pm 0.0119$ & $\textbf{0.8628} \pm 0.0135$ \\
$annthyroid$ & $0.8399 \pm 0.0135$ & $\textbf{0.9087} \pm 0.0090$ \\
$Vehicle\_van$ & $\textbf{0.8792} \pm 0.0265$ & $\textbf{0.8697} \pm 0.0383$ \\
\hline
\end{tabular}
</frame>

21
data/019.txt Normal file

@ -0,0 +1,21 @@
<frame >
\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$ionosphere$ & $\textbf{0.9320} \pm 0.0069$ & $0.9086 \pm 0.0142$ \\
$page-blocks$ & $0.9189 \pm 0.0061$ & $\textbf{0.9299} \pm 0.0016$ \\
$Ecoli$ & $\textbf{0.9418} \pm 0.0292$ & $0.9192 \pm 0.0332$ \\
$cardio$ & $\textbf{0.9564} \pm 0.0043$ & $\textbf{0.9535} \pm 0.0036$ \\
$wbc$ & $\textbf{0.9611} \pm 0.0121$ & $\textbf{0.9607} \pm 0.0107$ \\
$pendigits$ & $\textbf{0.9641} \pm 0.0097$ & $\textbf{0.9652} \pm 0.0076$ \\
$thyroid$ & $0.9818 \pm 0.0024$ & $\textbf{0.9871} \pm 0.0025$ \\
$breastw$ & $\textbf{0.9948} \pm 0.0031$ & $\textbf{0.9952} \pm 0.0033$ \\
$segment$ & $\textbf{1.0}$ & $\textbf{0.9993} \pm 0.0015$ \\
$$ & $$ & $$ \\
$Average$ & $\textbf{0.8005} \pm 0.1458$ & $\textbf{0.7957} \pm 0.1431$ \\
\hline
\end{tabular}
</frame>

4
data/020highdim.txt Normal file

@ -0,0 +1,4 @@
<section Experiments 2>
<frame title="highdim">
<i f="../prep/19highdim/a.png" wmode="True"></i>
</frame>

13
data/021New Condition.txt Normal file

@ -0,0 +1,13 @@
<frame title="New Condition">
<code>
condition= ((number_of_samples>200) &
            (number_of_samples<10000) &
            (number_of_features>50) &
            (number_of_features<500) &
            ~index)
print(len(condition),"Datasets found")
</code>
->13 Datasets found
</frame>

12
data/022New Models.txt Normal file

@ -0,0 +1,12 @@
<frame title="New Models">
<code>
from pyod.models.iforest import IForest
from pyod.models.knn import KNN
from pyod.models.lof import LOF
l=Logger({"IFor":IForest(n_estimators=100),
"Lof":LOF(),
"Knn": KNN()}, addfeat=True)
</code>
</frame>

25
data/023.txt Normal file

@ -0,0 +1,25 @@
<frame >
\begin{tabular}{llll}
\hline
Dataset & Knn & Lof & IFor \\
\hline
$Delft\_pump\_5x3\_noisy(64)$ & $0.3800 \pm 0.0475$ & $0.3462 \pm 0.0327$ & $\textbf{0.4272} \pm 0.0680$ \\
$hill-valley(100)$ & $0.4744 \pm 0.0269$ & $\textbf{0.5060} \pm 0.0327$ & $0.4720 \pm 0.0288$ \\
$speech(400)$ & $0.4903 \pm 0.0103$ & $\textbf{0.5104} \pm 0.0115$ & $0.4872 \pm 0.0184$ \\
$Sonar\_mines(60)$ & $\textbf{0.7284} \pm 0.0939$ & $0.6769 \pm 0.0933$ & $0.6189 \pm 0.1301$ \\
$ozone-level-8hr(72)$ & $\textbf{0.8051} \pm 0.0288$ & $0.7738 \pm 0.0292$ & $\textbf{0.7768} \pm 0.0118$ \\
$spambase(57)$ & $0.8038 \pm 0.0125$ & $0.7712 \pm 0.0055$ & $\textbf{0.8202} \pm 0.0042$ \\
$arrhythmia(274)$ & $\textbf{0.8137} \pm 0.0185$ & $0.8042 \pm 0.0186$ & $\textbf{0.8086} \pm 0.0099$ \\
$mnist(100)$ & $0.9345 \pm 0.0039$ & $\textbf{0.9548} \pm 0.0037$ & $0.8732 \pm 0.0069$ \\
$Concordia3\_32(256)$ & $0.9246 \pm 0.0107$ & $\textbf{0.9486} \pm 0.0099$ & $\textbf{0.9322} \pm 0.0178$ \\
$optdigits(64)$ & $0.9966 \pm 0.0012$ & $\textbf{0.9975} \pm 0.0012$ & $0.8142 \pm 0.0267$ \\
$gas-drift(128)$ & $\textbf{0.9790} \pm 0.0018$ & $0.9585 \pm 0.0055$ & $0.8764 \pm 0.0166$ \\
$Delft\_pump\_AR(160)$ & $\textbf{0.9965}$ & $\textbf{0.9953} \pm 0.0019$ & $0.9665 \pm 0.0096$ \\
$musk(166)$ & $\textbf{1.0}$ & $\textbf{1.0}$ & $0.9808 \pm 0.0117$ \\
$$ & $$ & $$ & $$ \\
$Average$ & $\textbf{0.7944}$ & $\textbf{0.7879}$ & $0.7580$ \\
\hline
\end{tabular}
</frame>

11
data/024.txt Normal file

@ -0,0 +1,11 @@
<frame >
<l2st>
<e>Hypothesis: Isolation Forests are better when there are numerical and nominal attributes</e>
<e>Easy to test</e>
</l2st>
<code>
condition=condition & (numeric & nominal)
</code>
</frame>

20
data/025.txt Normal file

@ -0,0 +1,20 @@
<frame >
\begin{tabular}{llll}
\hline
Dataset & Knn & IFor & Lof \\
\hline
$ozone-level-8hr(72)$ & $\textbf{0.8051} \pm 0.0288$ & $\textbf{0.7768} \pm 0.0118$ & $0.7738 \pm 0.0292$ \\
$spambase(57)$ & $0.8038 \pm 0.0125$ & $\textbf{0.8202} \pm 0.0042$ & $0.7712 \pm 0.0055$ \\
$arrhythmia(274)$ & $\textbf{0.8137} \pm 0.0185$ & $\textbf{0.8086} \pm 0.0099$ & $0.8042 \pm 0.0186$ \\
$musk(166)$ & $\textbf{1.0}$ & $0.9808 \pm 0.0117$ & $\textbf{1.0}$ \\
$$ & $$ & $$ & $$ \\
$Average$ & $\textbf{0.8556}$ & $\textbf{0.8466}$ & $\textbf{0.8373}$ \\
\hline
\end{tabular}
<l2st>
<e>Only 4 datasets, so not clear at all</e>
<e>->More datasets</e>
</l2st>
</frame>


@ -0,0 +1,12 @@
<section Experiments 3>
<frame title="Unsupervised Optimization">
<list>
<e>There are analyses that are only possible with many datasets</e>
<e>Here: unsupervised optimization</e>
<e>Given multiple AD models, find which is best:</e>
<e>Use AUC score? Requires Anomalies->Overfitting</e>
<e>Can you find an unsupervised Method?</e>
<e>In general this is very complicated, so here I only focus on very small differences between the models.</e>
<e>So each model is an autoencoder, trained on the same dataset, where the difference is only in the initialisation</e>
</list>
</frame>


@ -0,0 +1,20 @@
<repeat w="['page-blocks','pima']">
<frame title="Loss Optimization">
<split>
<que>
<list>
<e>First guess: the loss of the model on the training data</e>
<e>How to evaluate this?</e>
<e>Train many models, look at the average AUC score.</e>
<e>For the alternative, take groups of 20 models, and look at the AUC score of the best model.</e>
<e>Is there a meaningful difference between the results? Give the result as a z\_score (#(m_1-m_2)/sqrt(s_1**2+s_2**2)#)</e>
<e>This difference depends a lot on the dataset</e>
<e>->even #LessThan(30,z)# does not mean much</e>
</list>
</que>
<que>
<i f="histone_???" wmode="True"></i>
</que>
</split>
</frame>
</repeat>
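The z-score from the slide, comparing two strategies via their means and standard deviations, is a one-liner:

```python
import math

def z_score(m1, s1, m2, s2):
    """z = (m1 - m2) / sqrt(s1^2 + s2^2), as on the slide."""
    return (m1 - m2) / math.sqrt(s1**2 + s2**2)

# Hypothetical mean/std pairs for the two selection strategies:
print(round(z_score(0.80, 0.03, 0.74, 0.04), 2))  # -> 1.2
```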

12
data/028loss.txt Normal file

@ -0,0 +1,12 @@
<frame title="loss">
<split>
<que>
<list>
<e>Pick the Model with the lowest l2-loss</e>
</list>
</que>
<que>
<i f="../prep/27loss/z_loss.pdf" wmode="True"></i>
</que>
</split>
</frame>

14
data/029Robustness.txt Normal file

@ -0,0 +1,14 @@
<frame title="Robustness">
<split>
<que>
<list>
<e>Sample points within 1\% of the input-space width around each point.</e>
<e>For each point, find the maximum difference in output space.</e>
<e>Average this difference.</e>
</list>
</que>
<que>
<i f="../prep/28Robustness/z_robu.pdf" wmode="True"></i>
</que>
</split>
</frame>
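The robustness score described above can be sketched directly: around each point, sample perturbations within 1% of the input range, take the maximum change in the model output, and average over points. The sampling scheme (uniform box, fixed probe count) is an assumption for illustration:

```python
import random

def robustness(model, points, width, n_probes=64, frac=0.01, seed=0):
    """Average, over points, of the max output change under small probes."""
    rng = random.Random(seed)
    total = 0.0
    for x in points:
        base = model(x)
        worst = 0.0
        for _ in range(n_probes):
            # perturb each coordinate by up to frac * input-space width
            probe = [v + rng.uniform(-frac, frac) * width for v in x]
            worst = max(worst, abs(model(probe) - base))
        total += worst
    return total / len(points)

smooth = lambda x: sum(x)                         # smooth scoring function
spiky = lambda x: sum(round(100 * v) for v in x)  # jumpy scoring function
pts = [[0.1, 0.2], [0.5, 0.5], [0.9, 0.3]]
# a smooth model should score as more robust (smaller value):
print(robustness(smooth, pts, width=1.0) < robustness(spiky, pts, width=1.0))
```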


@ -0,0 +1,14 @@
<frame title="Distance Correlation">
<split>
<que>
<list>
<e>Pick random points in the input space.</e>
<e>Measure the distance in input and output space.</e>
<e>A low correlation indicates a good model.</e>
</list>
</que>
<que>
<i f="../prep/29Distance_Correlation/z_dist.pdf" wmode="True"></i>
</que>
</split>
</frame>
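The distance-correlation score above, sketched with plain Pearson correlation between input- and output-space distances of random point pairs; yano's exact definition may differ:

```python
import math
import random

def dist_correlation(model, dim, n_pairs=200, seed=0):
    """Correlation between input and output distances of random pairs."""
    rng = random.Random(seed)
    din, dout = [], []
    for _ in range(n_pairs):
        a = [rng.random() for _ in range(dim)]
        b = [rng.random() for _ in range(dim)]
        din.append(math.dist(a, b))            # input-space distance
        dout.append(abs(model(a) - model(b)))  # output-space distance
    mi, mo = sum(din) / n_pairs, sum(dout) / n_pairs
    cov = sum((x - mi) * (y - mo) for x, y in zip(din, dout))
    var_i = sum((x - mi) ** 2 for x in din)
    var_o = sum((y - mo) ** 2 for y in dout)
    return cov / math.sqrt(var_i * var_o)

# A model whose output tracks the input gives a high correlation:
r = dist_correlation(lambda x: sum(x), dim=3)
print(0 < r <= 1)  # -> True
```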

12
data/031Other.txt Normal file

@ -0,0 +1,12 @@
<section Conclusion>
<frame title="Other">
<list>
<e>Things I still want to add:</e>
<l2st>
<e>Ensemble Methods</e>
<e>Visualisation options</e>
<e>Alternative Evaluations</e>
<e>Hyperparameter optimisation (with crossvalidation)</e>
</l2st>
</list>
</frame>

7
data/032Feedback.txt Normal file

@ -0,0 +1,7 @@
<frame title="Feedback">
<list>
<e>What do you think about this?</e>
<e>Is there something I should also add?</e>
<e>What would it take for you to actually use this?</e>
</list>
</frame>

12
general.txt Normal file

@ -0,0 +1,12 @@
<plt>
<name Current experiment status>
<title pip install yano>
<stitle yano>
<institute ls9 tu Dortmund>
<theme CambridgeUS>
<colo dolphin>
</plt>

1
imgs Submodule

@ -0,0 +1 @@
Subproject commit 62ffd6ae589d7983791feea9d44d7658534d54a0

3
out/compile.bat Normal file

@ -0,0 +1,3 @@
pdflatex main.tex
pdflatex main.tex

3
out/compile.sh Executable file

@ -0,0 +1,3 @@
pdflatex main.tex
pdflatex main.tex

127
out/label.json Normal file

@ -0,0 +1,127 @@
[
{
"typ": "img",
"files": [
"../prep/04yano/a.png"
],
"label": "prep04yanoapng",
"caption": "",
"where": "../yano//data/004yano.txt"
},
{
"typ": "section",
"title": "Basics",
"label": "Basics",
"file": "../yano//data/005selector.txt",
"issec": true
},
{
"typ": "section",
"title": "Experiments 1",
"label": "Experiments 1",
"file": "../yano//data/016Extended Isolation Forests.txt",
"issec": true
},
{
"typ": "img",
"files": [
"../imgs/ifor"
],
"label": "ifor",
"caption": "",
"where": "../yano//data/016Extended Isolation Forests.txt"
},
{
"typ": "img",
"files": [
"../imgs/eifor"
],
"label": "eifor",
"caption": "",
"where": "../yano//data/016Extended Isolation Forests.txt"
},
{
"typ": "img",
"files": [
"../imgs/qual"
],
"label": "qual",
"caption": "",
"where": "../yano//data/016Extended Isolation Forests.txt"
},
{
"typ": "section",
"title": "Experiments 2",
"label": "Experiments 2",
"file": "../yano//data/020highdim.txt",
"issec": true
},
{
"typ": "img",
"files": [
"../prep/19highdim/a.png"
],
"label": "prep19highdimapng",
"caption": "",
"where": "../yano//data/020highdim.txt"
},
{
"typ": "section",
"title": "Experiments 3",
"label": "Experiments 3",
"file": "../yano//data/026Unsupervised Optimization.txt",
"issec": true
},
{
"typ": "img",
"files": [
"../imgs/histone_page-blocks"
],
"label": "histone_page-blocks",
"caption": "",
"where": "../yano//data/027Loss Optimization.txt"
},
{
"typ": "img",
"files": [
"../imgs/histone_pima"
],
"label": "histone_pima",
"caption": "",
"where": "../yano//data/027Loss Optimization.txt"
},
{
"typ": "img",
"files": [
"../prep/27loss/z_loss.pdf"
],
"label": "prep27lossz_losspdf",
"caption": "",
"where": "../yano//data/028loss.txt"
},
{
"typ": "img",
"files": [
"../prep/28Robustness/z_robu.pdf"
],
"label": "prep28Robustnessz_robupdf",
"caption": "",
"where": "../yano//data/029Robustness.txt"
},
{
"typ": "img",
"files": [
"../prep/29Distance_Correlation/z_dist.pdf"
],
"label": "prep29Distance_Correlationz_distpdf",
"caption": "",
"where": "../yano//data/030Distance Correlation.txt"
},
{
"typ": "section",
"title": "Conclusion",
"label": "Conclusion",
"file": "../yano//data/031Other.txt",
"issec": true
}
]

258
out/main.aux Normal file

@ -0,0 +1,258 @@
\relax
\providecommand\hyper@newdestlabel[2]{}
\providecommand\HyperFirstAtBeginDocument{\AtBeginDocument}
\HyperFirstAtBeginDocument{\ifx\hyper@anchor\@undefined
\global\let\oldcontentsline\contentsline
\gdef\contentsline#1#2#3#4{\oldcontentsline{#1}{#2}{#3}}
\global\let\oldnewlabel\newlabel
\gdef\newlabel#1#2{\newlabelxx{#1}#2}
\gdef\newlabelxx#1#2#3#4#5#6{\oldnewlabel{#1}{{#2}{#3}}}
\AtEndDocument{\ifx\hyper@anchor\@undefined
\let\contentsline\oldcontentsline
\let\newlabel\oldnewlabel
\fi}
\fi}
\global\let\hyper@last\relax
\gdef\HyperFirstAtBeginDocument#1{#1}
\providecommand\HyField@AuxAddToFields[1]{}
\providecommand\HyField@AuxAddToCoFields[2]{}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{1}{1/1}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {1}{1}}}
\newlabel{Problem<1>}{{2}{2}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Problem<1>}{2}}
\newlabel{Problem}{{2}{2}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Problem}{2}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{2}{2/2}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {2}{2}}}
\newlabel{Students<1>}{{3}{3}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Students<1>}{3}}
\newlabel{Students}{{3}{3}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Students}{3}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{3}{3/3}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {3}{3}}}
\newlabel{yano<1>}{{4}{4}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {yano<1>}{4}}
\newlabel{yano}{{4}{4}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {yano}{4}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{4}{4/4}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {4}{4}}}
\newlabel{yano<1>}{{5}{5}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {yano<1>}{5}}
\newlabel{yano}{{5}{5}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {yano}{5}}
\newlabel{fig:prep04yanoapng}{{5}{5}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep04yanoapng}{5}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{5}{5/5}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {5}{5}}}
\@writefile{toc}{\beamer@sectionintoc {1}{Basics}{6}{0}{1}}
\@writefile{nav}{\headcommand {\beamer@sectionpages {1}{5}}}
\@writefile{nav}{\headcommand {\beamer@subsectionpages {1}{5}}}
\@writefile{nav}{\headcommand {\sectionentry {1}{Basics}{6}{Basics}{0}}}
\newlabel{sec:Basics}{{1}{6}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {sec:Basics}{6}}
\newlabel{selector<1>}{{6}{6}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {selector<1>}{6}}
\newlabel{selector}{{6}{6}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {selector}{6}}
\@writefile{nav}{\headcommand {\slideentry {1}{0}{1}{6/6}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {6}{6}}}
\newlabel{selectors<1>}{{7}{7}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {selectors<1>}{7}}
\newlabel{selectors}{{7}{7}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {selectors}{7}}
\@writefile{nav}{\headcommand {\slideentry {1}{0}{2}{7/7}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {7}{7}}}
\newlabel{iterating<1>}{{8}{8}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {iterating<1>}{8}}
\newlabel{iterating}{{8}{8}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {iterating}{8}}
\@writefile{nav}{\headcommand {\slideentry {1}{0}{3}{8/8}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {8}{8}}}
\newlabel{iterating<1>}{{9}{9}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {iterating<1>}{9}}
\newlabel{iterating}{{9}{9}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {iterating}{9}}
\@writefile{nav}{\headcommand {\slideentry {1}{0}{4}{9/9}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {9}{9}}}
\newlabel{pipeline<1>}{{10}{10}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {pipeline<1>}{10}}
\newlabel{pipeline}{{10}{10}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {pipeline}{10}}
\@writefile{nav}{\headcommand {\slideentry {1}{0}{5}{10/10}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {10}{10}}}
\newlabel{pipeline<1>}{{11}{11}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {pipeline<1>}{11}}
\newlabel{pipeline}{{11}{11}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {pipeline}{11}}
\@writefile{nav}{\headcommand {\slideentry {1}{0}{6}{11/11}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {11}{11}}}
\newlabel{CrossValidation<1>}{{12}{12}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {CrossValidation<1>}{12}}
\newlabel{CrossValidation}{{12}{12}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {CrossValidation}{12}}
\@writefile{nav}{\headcommand {\slideentry {1}{0}{7}{12/12}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {12}{12}}}
\newlabel{Logging<1>}{{13}{13}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Logging<1>}{13}}
\newlabel{Logging}{{13}{13}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Logging}{13}}
\@writefile{nav}{\headcommand {\slideentry {1}{0}{8}{13/13}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {13}{13}}}
\newlabel{Seeding<1>}{{14}{14}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Seeding<1>}{14}}
\newlabel{Seeding}{{14}{14}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Seeding}{14}}
\@writefile{nav}{\headcommand {\slideentry {1}{0}{9}{14/14}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {14}{14}}}
\@writefile{nav}{\headcommand {\slideentry {1}{0}{10}{15/15}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {15}{15}}}
\newlabel{statistics<1>}{{16}{16}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {statistics<1>}{16}}
\newlabel{statistics}{{16}{16}{Basics}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {statistics}{16}}
\@writefile{nav}{\headcommand {\slideentry {1}{0}{11}{16/16}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {16}{16}}}
\@writefile{toc}{\beamer@sectionintoc {2}{Experiments 1}{17}{0}{2}}
\@writefile{nav}{\headcommand {\beamer@sectionpages {6}{16}}}
\@writefile{nav}{\headcommand {\beamer@subsectionpages {6}{16}}}
\@writefile{nav}{\headcommand {\sectionentry {2}{Experiments 1}{17}{Experiments 1}{0}}}
\newlabel{sec:Experiments 1}{{2}{17}{Experiments 1}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {sec:Experiments 1}{17}}
\newlabel{Extended Isolation Forests<1>}{{17}{17}{Experiments 1}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Extended Isolation Forests<1>}{17}}
\newlabel{Extended Isolation Forests}{{17}{17}{Experiments 1}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Extended Isolation Forests}{17}}
\newlabel{fig:ifor}{{17}{17}{Experiments 1}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:ifor}{17}}
\@writefile{nav}{\headcommand {\slideentry {2}{0}{1}{17/17}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {17}{17}}}
\newlabel{Extended Isolation Forests<1>}{{18}{18}{Experiments 1}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Extended Isolation Forests<1>}{18}}
\newlabel{Extended Isolation Forests}{{18}{18}{Experiments 1}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Extended Isolation Forests}{18}}
\newlabel{fig:eifor}{{18}{18}{Experiments 1}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:eifor}{18}}
\@writefile{nav}{\headcommand {\slideentry {2}{0}{2}{18/18}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {18}{18}}}
\newlabel{Extended Isolation Forests<1>}{{19}{19}{Experiments 1}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Extended Isolation Forests<1>}{19}}
\newlabel{Extended Isolation Forests}{{19}{19}{Experiments 1}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Extended Isolation Forests}{19}}
\newlabel{fig:qual}{{19}{19}{Experiments 1}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:qual}{19}}
\@writefile{nav}{\headcommand {\slideentry {2}{0}{3}{19/19}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {19}{19}}}
\@writefile{nav}{\headcommand {\slideentry {2}{0}{4}{20/20}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {20}{20}}}
\@writefile{nav}{\headcommand {\slideentry {2}{0}{5}{21/21}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {21}{21}}}
\@writefile{nav}{\headcommand {\slideentry {2}{0}{6}{22/22}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {22}{22}}}
\@writefile{toc}{\beamer@sectionintoc {3}{Experiments 2}{23}{0}{3}}
\@writefile{nav}{\headcommand {\beamer@sectionpages {17}{22}}}
\@writefile{nav}{\headcommand {\beamer@subsectionpages {17}{22}}}
\@writefile{nav}{\headcommand {\sectionentry {3}{Experiments 2}{23}{Experiments 2}{0}}}
\newlabel{sec:Experiments 2}{{3}{23}{Experiments 2}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {sec:Experiments 2}{23}}
\newlabel{highdim<1>}{{23}{23}{Experiments 2}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {highdim<1>}{23}}
\newlabel{highdim}{{23}{23}{Experiments 2}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {highdim}{23}}
\newlabel{fig:prep19highdimapng}{{23}{23}{Experiments 2}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep19highdimapng}{23}}
\@writefile{nav}{\headcommand {\slideentry {3}{0}{1}{23/23}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {23}{23}}}
\newlabel{New Condition<1>}{{24}{24}{Experiments 2}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {New Condition<1>}{24}}
\newlabel{New Condition}{{24}{24}{Experiments 2}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {New Condition}{24}}
\@writefile{nav}{\headcommand {\slideentry {3}{0}{2}{24/24}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {24}{24}}}
\newlabel{New Models<1>}{{25}{25}{Experiments 2}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {New Models<1>}{25}}
\newlabel{New Models}{{25}{25}{Experiments 2}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {New Models}{25}}
\@writefile{nav}{\headcommand {\slideentry {3}{0}{3}{25/25}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {25}{25}}}
\@writefile{nav}{\headcommand {\slideentry {3}{0}{4}{26/26}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {26}{26}}}
\@writefile{nav}{\headcommand {\slideentry {3}{0}{5}{27/27}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {27}{27}}}
\@writefile{nav}{\headcommand {\slideentry {3}{0}{6}{28/28}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {28}{28}}}
\@writefile{toc}{\beamer@sectionintoc {4}{Experiments 3}{29}{0}{4}}
\@writefile{nav}{\headcommand {\beamer@sectionpages {23}{28}}}
\@writefile{nav}{\headcommand {\beamer@subsectionpages {23}{28}}}
\@writefile{nav}{\headcommand {\sectionentry {4}{Experiments 3}{29}{Experiments 3}{0}}}
\newlabel{sec:Experiments 3}{{4}{29}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {sec:Experiments 3}{29}}
\newlabel{Unsupervised Optimization<1>}{{29}{29}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Unsupervised Optimization<1>}{29}}
\newlabel{Unsupervised Optimization}{{29}{29}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Unsupervised Optimization}{29}}
\@writefile{nav}{\headcommand {\slideentry {4}{0}{1}{29/29}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {29}{29}}}
\newlabel{Loss Optimization<1>}{{30}{30}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Loss Optimization<1>}{30}}
\newlabel{Loss Optimization}{{30}{30}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Loss Optimization}{30}}
\newlabel{fig:histone_page-blocks}{{30}{30}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:histone_page-blocks}{30}}
\@writefile{nav}{\headcommand {\slideentry {4}{0}{2}{30/30}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {30}{30}}}
\newlabel{Loss Optimization<1>}{{31}{31}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Loss Optimization<1>}{31}}
\newlabel{Loss Optimization}{{31}{31}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Loss Optimization}{31}}
\newlabel{fig:histone_pima}{{31}{31}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:histone_pima}{31}}
\@writefile{nav}{\headcommand {\slideentry {4}{0}{3}{31/31}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {31}{31}}}
\newlabel{loss<1>}{{32}{32}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {loss<1>}{32}}
\newlabel{loss}{{32}{32}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {loss}{32}}
\newlabel{fig:prep27lossz_losspdf}{{32}{32}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep27lossz_losspdf}{32}}
\@writefile{nav}{\headcommand {\slideentry {4}{0}{4}{32/32}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {32}{32}}}
\newlabel{Robustness<1>}{{33}{33}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Robustness<1>}{33}}
\newlabel{Robustness}{{33}{33}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Robustness}{33}}
\newlabel{fig:prep28Robustnessz_robupdf}{{33}{33}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep28Robustnessz_robupdf}{33}}
\@writefile{nav}{\headcommand {\slideentry {4}{0}{5}{33/33}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {33}{33}}}
\newlabel{Distance Correlation<1>}{{34}{34}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Distance Correlation<1>}{34}}
\newlabel{Distance Correlation}{{34}{34}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Distance Correlation}{34}}
\newlabel{fig:prep29Distance_Correlationz_distpdf}{{34}{34}{Experiments 3}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep29Distance_Correlationz_distpdf}{34}}
\@writefile{nav}{\headcommand {\slideentry {4}{0}{6}{34/34}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {34}{34}}}
\@writefile{toc}{\beamer@sectionintoc {5}{Conclusion}{35}{0}{5}}
\@writefile{nav}{\headcommand {\beamer@sectionpages {29}{34}}}
\@writefile{nav}{\headcommand {\beamer@subsectionpages {29}{34}}}
\@writefile{nav}{\headcommand {\sectionentry {5}{Conclusion}{35}{Conclusion}{0}}}
\newlabel{sec:Conclusion}{{5}{35}{Conclusion}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {sec:Conclusion}{35}}
\newlabel{Other<1>}{{35}{35}{Conclusion}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Other<1>}{35}}
\newlabel{Other}{{35}{35}{Conclusion}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Other}{35}}
\@writefile{nav}{\headcommand {\slideentry {5}{0}{1}{35/35}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {35}{35}}}
\newlabel{Feedback<1>}{{36}{36}{Conclusion}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Feedback<1>}{36}}
\newlabel{Feedback}{{36}{36}{Conclusion}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Feedback}{36}}
\@writefile{nav}{\headcommand {\slideentry {5}{0}{2}{36/36}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {36}{36}}}
\@writefile{nav}{\headcommand {\beamer@partpages {1}{36}}}
\@writefile{nav}{\headcommand {\beamer@subsectionpages {35}{36}}}
\@writefile{nav}{\headcommand {\beamer@sectionpages {35}{36}}}
\@writefile{nav}{\headcommand {\beamer@documentpages {36}}}
\@writefile{nav}{\headcommand {\gdef \inserttotalframenumber {36}}}
\gdef \@abspage@last{36}

1331
out/main.log Normal file

File diff suppressed because it is too large

92
out/main.nav Normal file

@ -0,0 +1,92 @@
\headcommand {\slideentry {0}{0}{1}{1/1}{}{0}}
\headcommand {\beamer@framepages {1}{1}}
\headcommand {\slideentry {0}{0}{2}{2/2}{}{0}}
\headcommand {\beamer@framepages {2}{2}}
\headcommand {\slideentry {0}{0}{3}{3/3}{}{0}}
\headcommand {\beamer@framepages {3}{3}}
\headcommand {\slideentry {0}{0}{4}{4/4}{}{0}}
\headcommand {\beamer@framepages {4}{4}}
\headcommand {\slideentry {0}{0}{5}{5/5}{}{0}}
\headcommand {\beamer@framepages {5}{5}}
\headcommand {\beamer@sectionpages {1}{5}}
\headcommand {\beamer@subsectionpages {1}{5}}
\headcommand {\sectionentry {1}{Basics}{6}{Basics}{0}}
\headcommand {\slideentry {1}{0}{1}{6/6}{}{0}}
\headcommand {\beamer@framepages {6}{6}}
\headcommand {\slideentry {1}{0}{2}{7/7}{}{0}}
\headcommand {\beamer@framepages {7}{7}}
\headcommand {\slideentry {1}{0}{3}{8/8}{}{0}}
\headcommand {\beamer@framepages {8}{8}}
\headcommand {\slideentry {1}{0}{4}{9/9}{}{0}}
\headcommand {\beamer@framepages {9}{9}}
\headcommand {\slideentry {1}{0}{5}{10/10}{}{0}}
\headcommand {\beamer@framepages {10}{10}}
\headcommand {\slideentry {1}{0}{6}{11/11}{}{0}}
\headcommand {\beamer@framepages {11}{11}}
\headcommand {\slideentry {1}{0}{7}{12/12}{}{0}}
\headcommand {\beamer@framepages {12}{12}}
\headcommand {\slideentry {1}{0}{8}{13/13}{}{0}}
\headcommand {\beamer@framepages {13}{13}}
\headcommand {\slideentry {1}{0}{9}{14/14}{}{0}}
\headcommand {\beamer@framepages {14}{14}}
\headcommand {\slideentry {1}{0}{10}{15/15}{}{0}}
\headcommand {\beamer@framepages {15}{15}}
\headcommand {\slideentry {1}{0}{11}{16/16}{}{0}}
\headcommand {\beamer@framepages {16}{16}}
\headcommand {\beamer@sectionpages {6}{16}}
\headcommand {\beamer@subsectionpages {6}{16}}
\headcommand {\sectionentry {2}{Experiments 1}{17}{Experiments 1}{0}}
\headcommand {\slideentry {2}{0}{1}{17/17}{}{0}}
\headcommand {\beamer@framepages {17}{17}}
\headcommand {\slideentry {2}{0}{2}{18/18}{}{0}}
\headcommand {\beamer@framepages {18}{18}}
\headcommand {\slideentry {2}{0}{3}{19/19}{}{0}}
\headcommand {\beamer@framepages {19}{19}}
\headcommand {\slideentry {2}{0}{4}{20/20}{}{0}}
\headcommand {\beamer@framepages {20}{20}}
\headcommand {\slideentry {2}{0}{5}{21/21}{}{0}}
\headcommand {\beamer@framepages {21}{21}}
\headcommand {\slideentry {2}{0}{6}{22/22}{}{0}}
\headcommand {\beamer@framepages {22}{22}}
\headcommand {\beamer@sectionpages {17}{22}}
\headcommand {\beamer@subsectionpages {17}{22}}
\headcommand {\sectionentry {3}{Experiments 2}{23}{Experiments 2}{0}}
\headcommand {\slideentry {3}{0}{1}{23/23}{}{0}}
\headcommand {\beamer@framepages {23}{23}}
\headcommand {\slideentry {3}{0}{2}{24/24}{}{0}}
\headcommand {\beamer@framepages {24}{24}}
\headcommand {\slideentry {3}{0}{3}{25/25}{}{0}}
\headcommand {\beamer@framepages {25}{25}}
\headcommand {\slideentry {3}{0}{4}{26/26}{}{0}}
\headcommand {\beamer@framepages {26}{26}}
\headcommand {\slideentry {3}{0}{5}{27/27}{}{0}}
\headcommand {\beamer@framepages {27}{27}}
\headcommand {\slideentry {3}{0}{6}{28/28}{}{0}}
\headcommand {\beamer@framepages {28}{28}}
\headcommand {\beamer@sectionpages {23}{28}}
\headcommand {\beamer@subsectionpages {23}{28}}
\headcommand {\sectionentry {4}{Experiments 3}{29}{Experiments 3}{0}}
\headcommand {\slideentry {4}{0}{1}{29/29}{}{0}}
\headcommand {\beamer@framepages {29}{29}}
\headcommand {\slideentry {4}{0}{2}{30/30}{}{0}}
\headcommand {\beamer@framepages {30}{30}}
\headcommand {\slideentry {4}{0}{3}{31/31}{}{0}}
\headcommand {\beamer@framepages {31}{31}}
\headcommand {\slideentry {4}{0}{4}{32/32}{}{0}}
\headcommand {\beamer@framepages {32}{32}}
\headcommand {\slideentry {4}{0}{5}{33/33}{}{0}}
\headcommand {\beamer@framepages {33}{33}}
\headcommand {\slideentry {4}{0}{6}{34/34}{}{0}}
\headcommand {\beamer@framepages {34}{34}}
\headcommand {\beamer@sectionpages {29}{34}}
\headcommand {\beamer@subsectionpages {29}{34}}
\headcommand {\sectionentry {5}{Conclusion}{35}{Conclusion}{0}}
\headcommand {\slideentry {5}{0}{1}{35/35}{}{0}}
\headcommand {\beamer@framepages {35}{35}}
\headcommand {\slideentry {5}{0}{2}{36/36}{}{0}}
\headcommand {\beamer@framepages {36}{36}}
\headcommand {\beamer@partpages {1}{36}}
\headcommand {\beamer@subsectionpages {35}{36}}
\headcommand {\beamer@sectionpages {35}{36}}
\headcommand {\beamer@documentpages {36}}
\headcommand {\gdef \inserttotalframenumber {36}}

5
out/main.out Normal file

@ -0,0 +1,5 @@
\BOOKMARK [2][]{Outline0.1}{\376\377\000B\000a\000s\000i\000c\000s}{}% 1
\BOOKMARK [2][]{Outline0.2}{\376\377\000E\000x\000p\000e\000r\000i\000m\000e\000n\000t\000s\000\040\0001}{}% 2
\BOOKMARK [2][]{Outline0.3}{\376\377\000E\000x\000p\000e\000r\000i\000m\000e\000n\000t\000s\000\040\0002}{}% 3
\BOOKMARK [2][]{Outline0.4}{\376\377\000E\000x\000p\000e\000r\000i\000m\000e\000n\000t\000s\000\040\0003}{}% 4
\BOOKMARK [2][]{Outline0.5}{\376\377\000C\000o\000n\000c\000l\000u\000s\000i\000o\000n}{}% 5

BIN
out/main.pdf Normal file

Binary file not shown.

71
out/main.snm Normal file

@ -0,0 +1,71 @@
\beamer@slide {Problem<1>}{2}
\beamer@slide {Problem}{2}
\beamer@slide {Students<1>}{3}
\beamer@slide {Students}{3}
\beamer@slide {yano<1>}{4}
\beamer@slide {yano}{4}
\beamer@slide {yano<1>}{5}
\beamer@slide {yano}{5}
\beamer@slide {fig:prep04yanoapng}{5}
\beamer@slide {sec:Basics}{6}
\beamer@slide {selector<1>}{6}
\beamer@slide {selector}{6}
\beamer@slide {selectors<1>}{7}
\beamer@slide {selectors}{7}
\beamer@slide {iterating<1>}{8}
\beamer@slide {iterating}{8}
\beamer@slide {iterating<1>}{9}
\beamer@slide {iterating}{9}
\beamer@slide {pipeline<1>}{10}
\beamer@slide {pipeline}{10}
\beamer@slide {pipeline<1>}{11}
\beamer@slide {pipeline}{11}
\beamer@slide {CrossValidation<1>}{12}
\beamer@slide {CrossValidation}{12}
\beamer@slide {Logging<1>}{13}
\beamer@slide {Logging}{13}
\beamer@slide {Seeding<1>}{14}
\beamer@slide {Seeding}{14}
\beamer@slide {statistics<1>}{16}
\beamer@slide {statistics}{16}
\beamer@slide {sec:Experiments 1}{17}
\beamer@slide {Extended Isolation Forests<1>}{17}
\beamer@slide {Extended Isolation Forests}{17}
\beamer@slide {fig:ifor}{17}
\beamer@slide {Extended Isolation Forests<1>}{18}
\beamer@slide {Extended Isolation Forests}{18}
\beamer@slide {fig:eifor}{18}
\beamer@slide {Extended Isolation Forests<1>}{19}
\beamer@slide {Extended Isolation Forests}{19}
\beamer@slide {fig:qual}{19}
\beamer@slide {sec:Experiments 2}{23}
\beamer@slide {highdim<1>}{23}
\beamer@slide {highdim}{23}
\beamer@slide {fig:prep19highdimapng}{23}
\beamer@slide {New Condition<1>}{24}
\beamer@slide {New Condition}{24}
\beamer@slide {New Models<1>}{25}
\beamer@slide {New Models}{25}
\beamer@slide {sec:Experiments 3}{29}
\beamer@slide {Unsupervised Optimization<1>}{29}
\beamer@slide {Unsupervised Optimization}{29}
\beamer@slide {Loss Optimization<1>}{30}
\beamer@slide {Loss Optimization}{30}
\beamer@slide {fig:histone_page-blocks}{30}
\beamer@slide {Loss Optimization<1>}{31}
\beamer@slide {Loss Optimization}{31}
\beamer@slide {fig:histone_pima}{31}
\beamer@slide {loss<1>}{32}
\beamer@slide {loss}{32}
\beamer@slide {fig:prep27lossz_losspdf}{32}
\beamer@slide {Robustness<1>}{33}
\beamer@slide {Robustness}{33}
\beamer@slide {fig:prep28Robustnessz_robupdf}{33}
\beamer@slide {Distance Correlation<1>}{34}
\beamer@slide {Distance Correlation}{34}
\beamer@slide {fig:prep29Distance_Correlationz_distpdf}{34}
\beamer@slide {sec:Conclusion}{35}
\beamer@slide {Other<1>}{35}
\beamer@slide {Other}{35}
\beamer@slide {Feedback<1>}{36}
\beamer@slide {Feedback}{36}

1116
out/main.tex Normal file

File diff suppressed because it is too large

5
out/main.toc Normal file

@ -0,0 +1,5 @@
\beamer@sectionintoc {1}{Basics}{6}{0}{1}
\beamer@sectionintoc {2}{Experiments 1}{17}{0}{2}
\beamer@sectionintoc {3}{Experiments 2}{23}{0}{3}
\beamer@sectionintoc {4}{Experiments 3}{29}{0}{4}
\beamer@sectionintoc {5}{Conclusion}{35}{0}{5}

0
prep/000/nonl Normal file

1
prep/000/q Normal file

@ -0,0 +1 @@
<titlepage>

10
prep/01Problem/q Normal file

@ -0,0 +1,10 @@
Paper with Benedikt
require multiple very specific datasets
<l2st>
many but not too many features
at least some samples (for the NN)
Only numerical attributes best
specific quality
unrelated datasets
</l2st>
Requires you to search for many datasets and filter them

6
prep/02Students/q Normal file

@ -0,0 +1,6 @@
Not clear what you can use
Many different formats
train/test splits
So for Students I just do this work and send them archives directly
->Not a good solution

8
prep/03yano/q Normal file

@ -0,0 +1,8 @@
So I have been packaging all my scripts
I had a surprising amount of fun doing this
<l2st>
More than just standard functions
A couple of weird decisions
And this will likely grow further
</l2st>
->So I would like to discuss some parts with you and maybe you even have more features you might want

BIN
prep/04yano/a.png Normal file

Binary file not shown.


6
prep/04yano/q Normal file

@ -0,0 +1,6 @@
Simply install it over pip
Contains 187 real-world datasets
->the biggest library of datasets explicitly for anomaly detection
not yet happy with this
especially since it mostly contains numerical and nominal attributes
->few categorical and no time-series attributes

0
prep/05selector/nonl Normal file

17
prep/05selector/q Normal file

@ -0,0 +1,17 @@
<code>
import yano
from yano.symbols import *

condition = ((number_of_features > 5) &
             (number_of_features < 100) &
             (number_of_samples > 100) &
             (number_of_samples < 10000) &
             (number_of_samples > 2 * number_of_features) &
             ~index)
print(len(condition), "Datasets found")
</code>
->33 Datasets found

26
prep/06selectors/q Normal file

@ -0,0 +1,26 @@
Lots of symbols like this
<l2st>
name
number\_of\_features
number\_of\_samples
index (correlated datasets)
</l2st>
Feature types
<l2st>
numeric
nominal
categorical
(textual)
</l2st>
Count based
<l2st>
number\_anomalies
number\_normals
fraction\_anomalies
</l2st>
Specific ones
<l2st>
image\_based
(linearly\_seperable)
</l2st>
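A minimal, self-contained sketch of how such selector symbols can be combined with & and ~ via operator overloading; this is an illustration of the idea only, not yano's actual implementation, and every name in it is hypothetical.

```python
# Hypothetical sketch: selector symbols overload comparisons to build
# composable conditions over dataset metadata. Not yano's real code.
class Symbol:
    def __init__(self, key):
        self.key = key
    def __gt__(self, other):
        return Condition(lambda d: d[self.key] > other)
    def __lt__(self, other):
        return Condition(lambda d: d[self.key] < other)

class Condition:
    def __init__(self, pred):
        self.pred = pred
    def __and__(self, other):
        return Condition(lambda d: self.pred(d) and other.pred(d))
    def __invert__(self):
        return Condition(lambda d: not self.pred(d))
    def select(self, datasets):
        return [d for d in datasets if self.pred(d)]

number_of_features = Symbol("features")
number_of_samples = Symbol("samples")

datasets = [{"name": "a", "features": 10, "samples": 500},
            {"name": "b", "features": 3, "samples": 500},
            {"name": "c", "features": 10, "samples": 50}]
cond = (number_of_features > 5) & (number_of_samples > 100)
print([d["name"] for d in cond.select(datasets)])  # -> ['a']
```

The appeal of this design is that conditions stay declarative: they can be combined, negated, and only evaluated when a dataset list is filtered.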

0
prep/07iterating/nonl Normal file

14
prep/07iterating/q Normal file

@ -0,0 +1,14 @@
<code>
for dataset in condition:
    print(dataset)
</code>
<l2st>
<e>\[annthyroid\]</e>
<e>\[breastw\]</e>
<e>\[cardio\]</e>
<e>\[...\]</e>
<e>\[Housing\_low\]</e>
</l2st>

0
prep/08iterating/nonl Normal file

8
prep/08iterating/q Normal file

@ -0,0 +1,8 @@
<code>
for dataset in condition:
    x = dataset.getx()
    y = dataset.gety()
</code>

0
prep/09pipeline/nonl Normal file

15
prep/09pipeline/q Normal file

@ -0,0 +1,15 @@
<code>
from yano.iter import *

for dataset, x, tx, ty in pipeline(condition,
                                   split,
                                   shuffle,
                                   normalize("minmax")):
    ...
</code>

12
prep/10pipeline/q Normal file

@ -0,0 +1,12 @@
Again, a couple of modifiers are possible
<l2st>
nonconst->remove constant features
shuffle
normalize('zscore'/'minmax')
cut(10)->use at most 10 datasets
split->train/test split, with all anomalies in the test set
crossval(5)->similar to split, but repeated multiple times (cross-validation)
</l2st>
Modifiers interact with each other
For example: normalize('minmax'), split
->train set always below 1, but no guarantees for the test set
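The interaction just described can be seen in a tiny sketch: if min-max normalization is fit on the train split only (an assumed behaviour for illustration, not yano's actual code), the train set is bounded by 1 but the test set is not.

```python
# Sketch: min-max scaling fit on train data only, so test values can exceed 1.
def minmax_fit(train):
    lo, hi = min(train), max(train)
    return lambda v: (v - lo) / (hi - lo)

train = [2.0, 4.0, 6.0]
test = [5.0, 9.0]                 # 9.0 lies outside the train range
scale = minmax_fit(train)
print([scale(v) for v in train])  # -> [0.0, 0.5, 1.0]
print([scale(v) for v in test])   # -> [0.75, 1.75]  (not bounded by 1)
```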

14
prep/11CrossValidation/q Normal file

@ -0,0 +1,14 @@
Learned from DMC: cross-validation is important
Rarely found in anomaly detection, why?
A bit more complicated (not all samples are equal), but no reason why not
->So I implemented it into yano
<l2st>
folding only on normal data
How to handle anomalies?
If not folding them, cross-validation is less useful
If folding them, rare anomalies often become even rarer
->test set always 50\% anomalous
->Also improves simple evaluation metrics (accuracy)
</l2st>
Do you know a reason why cross-validation is not common in AD?
Are there problems with the way I fold my anomalies?
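One possible reading of the folding scheme above, as a sketch (my assumption for illustration, not yano's actual implementation): only the normal samples are folded for training, and each test fold pairs held-out normals with an equal number of anomalies, so the test set is 50% anomalous.

```python
# Hypothetical AD cross-validation: train folds contain only normals; each
# test set is half held-out normals, half anomalies.
def ad_folds(normals, anomalies, k):
    folds = []
    for i in range(k):
        test_norm = normals[i::k]       # held-out normal fold
        test_anom = anomalies[i::k]     # anomaly slice for this split
        n = min(len(test_norm), len(test_anom))
        train = [x for j, x in enumerate(normals) if j % k != i]
        folds.append((train, test_norm[:n] + test_anom[:n]))
    return folds

normals = list(range(100))
anomalies = list(range(1000, 1010))     # anomalies are rare
train, test = ad_folds(normals, anomalies, 5)[0]
print(len(train), len(test))            # -> 80 4
```

Note the trade-off visible here: balancing the test set to 50% anomalies discards some held-out normals when anomalies are rare.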

0
prep/12Logging/nonl Normal file

21
prep/12Logging/q Normal file

@ -0,0 +1,21 @@
<code>
from yano.logging import Logger
from pyod.models.iforest import IForest
from extended_iforest import train_extended_ifor

l = Logger({"IFor": IForest(n_estimators=100),
            "eIFor": train_extended_ifor})
for dataset, folds in pipeline(condition,
                               crossval(5),
                               normalize("minmax"),
                               shuffle):
    l.run_cross(dataset, folds)
latex = l.to_latex()
</code>

5
prep/12Seeding/q Normal file
View File

@ -0,0 +1,5 @@
If you don't do anything, everything is seeded.
This makes rerunning a model until the performance looks good quite obvious.
But as every run is itself seeded, this might induce bias.
Do you think this is worth it?
Are there any problems with this?
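A sketch of how per-run seeding can be made deterministic (an assumed scheme for illustration, not necessarily what yano does): derive each run's seed from the experiment's identity, so a rerun reproduces the same result exactly while different folds still get different randomness.

```python
# Hypothetical deterministic per-run seeding derived from the run's identity.
import hashlib
import random

def run_seed(model_name, dataset_name, fold):
    key = f"{model_name}|{dataset_name}|{fold}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")

random.seed(run_seed("IFor", "breastw", 0))
first = random.random()
random.seed(run_seed("IFor", "breastw", 0))
print(random.random() == first)   # -> True: reruns are bit-identical
```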

0
prep/13/nonl Normal file

21
prep/13/q Normal file

@ -0,0 +1,21 @@
\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$pc3$ & $\textbf{0.7231} \pm 0.0153$ & $\textbf{0.7223} \pm 0.0178$ \\
$pima$ & $\textbf{0.7405} \pm 0.0110$ & $\textbf{0.7347} \pm 0.0126$ \\
$Diabetes\_present$ & $\textbf{0.7414} \pm 0.0195$ & $\textbf{0.7344} \pm 0.0242$ \\
$waveform-5000$ & $\textbf{0.7687} \pm 0.0123$ & $\textbf{0.7592} \pm 0.0206$ \\
$vowels$ & $\textbf{0.7843} \pm 0.0298$ & $\textbf{0.7753} \pm 0.0334$ \\
$Vowel\_0$ & $\textbf{0.8425} \pm 0.0698$ & $0.7193 \pm 0.0817$ \\
$Abalone\_1\_8$ & $\textbf{0.8525} \pm 0.0263$ & $0.8452 \pm 0.0257$ \\
$annthyroid$ & $0.8399 \pm 0.0135$ & $\textbf{0.9087} \pm 0.0090$ \\
$Vehicle\_van$ & $\textbf{0.8792} \pm 0.0265$ & $\textbf{0.8697} \pm 0.0383$ \\
$ionosphere$ & $\textbf{0.9320} \pm 0.0069$ & $0.9086 \pm 0.0142$ \\
$breastw$ & $\textbf{0.9948} \pm 0.0031$ & $\textbf{0.9952} \pm 0.0033$ \\
$segment$ & $\textbf{1.0}$ & $\textbf{0.9993} \pm 0.0015$ \\
$$ & $$ & $$ \\
$Average$ & $\textbf{0.8005}$ & $\textbf{0.7957}$ \\
\hline
\end{tabular}

7
prep/14statistics/q Normal file

@ -0,0 +1,7 @@
Friedman test to see if there is a difference between the models
Nemenyi test to see which models are equal; mark those equal to the maximum
For 2 models, Friedman is not defined -> use the Wilcoxon test
Does this match your expectation from the table?
Two models are 'equal' if the probability of them coming from the same distribution is #LessThan(p_b,p)#; what value should #Eq(p_b,0.1)# have?
Do I need to correct for p-hacking (n experiments, so should each test become stricter, or is that clear from the table)?
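For illustration, the Friedman test ranks the k models on each of the n datasets and compares their average ranks. A pure-Python sketch of the statistic (ties ignored for simplicity; in practice scipy.stats.friedmanchisquare and scipy.stats.wilcoxon provide the p-values):

```python
# Friedman chi-square statistic from per-dataset model rankings.
def friedman_stat(scores):                  # scores[dataset][model] = AUC
    n, k = len(scores), len(scores[0])
    ranksums = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j], reverse=True)
        for rank, j in enumerate(order, start=1):
            ranksums[j] += rank             # rank 1 = best model on this dataset
    avg = [r / n for r in ranksums]
    chi2 = 12 * n / (k * (k + 1)) * sum(r * r for r in avg) - 3 * n * (k + 1)
    return chi2, avg

scores = [[0.72, 0.71, 0.60],               # hypothetical AUCs, 4 datasets x 3 models
          [0.74, 0.73, 0.65],
          [0.84, 0.72, 0.70],
          [0.93, 0.91, 0.88]]
chi2, avg_ranks = friedman_stat(scores)
print(avg_ranks)                            # -> [1.0, 2.0, 3.0]: model 0 always wins
```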


@ -0,0 +1,9 @@
Isolation Forests are one algorithm for AD
They try to isolate abnormal (rare) points instead of modelling normal ones
Creative approach->fairly successful (3000 citations)
Many follow-up papers
Extended Isolation Forest (Hariri et al. 2018, 140 citations)
Removes bias from the Isolation Forests
Also claims to improve their anomaly detection quality
(repeat with both cuts and ad quality)

0
prep/16/nonl Normal file

19
prep/16/q Normal file

@ -0,0 +1,19 @@
\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$Delft\_pump\_5x3\_noisy$ & $\textbf{0.3893} \pm 0.0345$ & $\textbf{0.4272} \pm 0.0680$ \\
$vertebral$ & $\textbf{0.4260} \pm 0.0111$ & $\textbf{0.4554} \pm 0.0416$ \\
$Liver\_1$ & $0.5367 \pm 0.0508$ & $\textbf{0.5474} \pm 0.0541$ \\
$Sonar\_mines$ & $\textbf{0.6882} \pm 0.1264$ & $0.6189 \pm 0.1301$ \\
$letter$ & $\textbf{0.6756} \pm 0.0119$ & $0.6471 \pm 0.0111$ \\
$Glass\_building\_float$ & $\textbf{0.6480} \pm 0.1012$ & $\textbf{0.6755} \pm 0.1117$ \\
$pc3$ & $\textbf{0.7231} \pm 0.0153$ & $\textbf{0.7223} \pm 0.0178$ \\
$pima$ & $\textbf{0.7405} \pm 0.0110$ & $\textbf{0.7347} \pm 0.0126$ \\
$Diabetes\_present$ & $\textbf{0.7414} \pm 0.0195$ & $\textbf{0.7344} \pm 0.0242$ \\
$waveform-5000$ & $\textbf{0.7687} \pm 0.0123$ & $\textbf{0.7592} \pm 0.0206$ \\
$steel-plates-fault$ & $\textbf{0.7735} \pm 0.0351$ & $\textbf{0.7682} \pm 0.0402$ \\
$vowels$ & $\textbf{0.7843} \pm 0.0298$ & $\textbf{0.7753} \pm 0.0334$ \\
\hline
\end{tabular}

0
prep/17/nonl Normal file

19
prep/17/q Normal file

@ -0,0 +1,19 @@
\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$Vowel\_0$ & $\textbf{0.8425} \pm 0.0698$ & $0.7193 \pm 0.0817$ \\
$Housing\_low$ & $\textbf{0.7807} \pm 0.0333$ & $\textbf{0.7862} \pm 0.0336$ \\
$ozone-level-8hr$ & $\textbf{0.7904} \pm 0.0207$ & $\textbf{0.7768} \pm 0.0118$ \\
$Spectf\_0$ & $\textbf{0.8155} \pm 0.0255$ & $0.7535 \pm 0.0239$ \\
$HeartC$ & $0.7795 \pm 0.0258$ & $\textbf{0.8079} \pm 0.0255$ \\
$satellite$ & $\textbf{0.8125} \pm 0.0170$ & $\textbf{0.8103} \pm 0.0061$ \\
$optdigits$ & $\textbf{0.8099} \pm 0.0310$ & $\textbf{0.8142} \pm 0.0267$ \\
$spambase$ & $\textbf{0.8085} \pm 0.0110$ & $\textbf{0.8202} \pm 0.0042$ \\
$Abalone\_1\_8$ & $\textbf{0.8525} \pm 0.0263$ & $0.8452 \pm 0.0257$ \\
$qsar-biodeg$ & $\textbf{0.8584} \pm 0.0119$ & $\textbf{0.8628} \pm 0.0135$ \\
$annthyroid$ & $0.8399 \pm 0.0135$ & $\textbf{0.9087} \pm 0.0090$ \\
$Vehicle\_van$ & $\textbf{0.8792} \pm 0.0265$ & $\textbf{0.8697} \pm 0.0383$ \\
\hline
\end{tabular}

0
prep/18/nonl Normal file

18
prep/18/q Normal file

@ -0,0 +1,18 @@
\begin{tabular}{lll}
\hline
Dataset & eIFor & IFor \\
\hline
$ionosphere$ & $\textbf{0.9320} \pm 0.0069$ & $0.9086 \pm 0.0142$ \\
$page-blocks$ & $0.9189 \pm 0.0061$ & $\textbf{0.9299} \pm 0.0016$ \\
$Ecoli$ & $\textbf{0.9418} \pm 0.0292$ & $0.9192 \pm 0.0332$ \\
$cardio$ & $\textbf{0.9564} \pm 0.0043$ & $\textbf{0.9535} \pm 0.0036$ \\
$wbc$ & $\textbf{0.9611} \pm 0.0121$ & $\textbf{0.9607} \pm 0.0107$ \\
$pendigits$ & $\textbf{0.9641} \pm 0.0097$ & $\textbf{0.9652} \pm 0.0076$ \\
$thyroid$ & $0.9818 \pm 0.0024$ & $\textbf{0.9871} \pm 0.0025$ \\
$breastw$ & $\textbf{0.9948} \pm 0.0031$ & $\textbf{0.9952} \pm 0.0033$ \\
$segment$ & $\textbf{1.0}$ & $\textbf{0.9993} \pm 0.0015$ \\
$$ & $$ & $$ \\
$Average$ & $\textbf{0.8005} \pm 0.1458$ & $\textbf{0.7957} \pm 0.1431$ \\
\hline
\end{tabular}

BIN
prep/19highdim/a.png Normal file

Binary file not shown.



13
prep/20New_Condition/q Normal file

@ -0,0 +1,13 @@
<code>
condition = ((number_of_samples > 200) &
             (number_of_samples < 10000) &
             (number_of_features > 50) &
             (number_of_features < 500) &
             ~index)
print(len(condition), "Datasets found")
</code>
->13 Datasets found

0
prep/21New_Models/nonl Normal file

13
prep/21New_Models/q Normal file

@ -0,0 +1,13 @@
<code>
from pyod.models.iforest import IForest
from pyod.models.knn import KNN
from pyod.models.lof import LOF

l = Logger({"IFor": IForest(n_estimators=100),
            "Lof": LOF(),
            "Knn": KNN()}, addfeat=True)
</code>

0
prep/22/nonl Normal file

21
prep/22/q Normal file

@ -0,0 +1,21 @@
\begin{tabular}{llll}
\hline
Dataset & Knn & Lof & IFor \\
\hline
$Delft\_pump\_5x3\_noisy(64)$ & $0.3800 \pm 0.0475$ & $0.3462 \pm 0.0327$ & $\textbf{0.4272} \pm 0.0680$ \\
$hill-valley(100)$ & $0.4744 \pm 0.0269$ & $\textbf{0.5060} \pm 0.0327$ & $0.4720 \pm 0.0288$ \\
$speech(400)$ & $0.4903 \pm 0.0103$ & $\textbf{0.5104} \pm 0.0115$ & $0.4872 \pm 0.0184$ \\
$Sonar\_mines(60)$ & $\textbf{0.7284} \pm 0.0939$ & $0.6769 \pm 0.0933$ & $0.6189 \pm 0.1301$ \\
$ozone-level-8hr(72)$ & $\textbf{0.8051} \pm 0.0288$ & $0.7738 \pm 0.0292$ & $\textbf{0.7768} \pm 0.0118$ \\
$spambase(57)$ & $0.8038 \pm 0.0125$ & $0.7712 \pm 0.0055$ & $\textbf{0.8202} \pm 0.0042$ \\
$arrhythmia(274)$ & $\textbf{0.8137} \pm 0.0185$ & $0.8042 \pm 0.0186$ & $\textbf{0.8086} \pm 0.0099$ \\
$mnist(100)$ & $0.9345 \pm 0.0039$ & $\textbf{0.9548} \pm 0.0037$ & $0.8732 \pm 0.0069$ \\
$Concordia3\_32(256)$ & $0.9246 \pm 0.0107$ & $\textbf{0.9486} \pm 0.0099$ & $\textbf{0.9322} \pm 0.0178$ \\
$optdigits(64)$ & $0.9966 \pm 0.0012$ & $\textbf{0.9975} \pm 0.0012$ & $0.8142 \pm 0.0267$ \\
$gas-drift(128)$ & $\textbf{0.9790} \pm 0.0018$ & $0.9585 \pm 0.0055$ & $0.8764 \pm 0.0166$ \\
$Delft\_pump\_AR(160)$ & $\textbf{0.9965}$ & $\textbf{0.9953} \pm 0.0019$ & $0.9665 \pm 0.0096$ \\
$musk(166)$ & $\textbf{1.0}$ & $\textbf{1.0}$ & $0.9808 \pm 0.0117$ \\
$$ & $$ & $$ & $$ \\
$Average$ & $\textbf{0.7944}$ & $\textbf{0.7879}$ & $0.7580$ \\
\hline
\end{tabular}

0
prep/23/nonl Normal file

7
prep/23/q Normal file

@ -0,0 +1,7 @@
<l2st>
<e>Hypothesis: Isolation Forests are better when there are numerical and nominal attributes</e>
<e>Easy to test</e>
</l2st>
<code>
condition = condition & (numeric & nominal)
</code>

0
prep/24/nonl Normal file

19
prep/24/q Normal file

@ -0,0 +1,19 @@
\begin{tabular}{llll}
\hline
Dataset & Knn & IFor & Lof \\
\hline
$ozone-level-8hr(72)$ & $\textbf{0.8051} \pm 0.0288$ & $\textbf{0.7768} \pm 0.0118$ & $0.7738 \pm 0.0292$ \\
$spambase(57)$ & $0.8038 \pm 0.0125$ & $\textbf{0.8202} \pm 0.0042$ & $0.7712 \pm 0.0055$ \\
$arrhythmia(274)$ & $\textbf{0.8137} \pm 0.0185$ & $\textbf{0.8086} \pm 0.0099$ & $0.8042 \pm 0.0186$ \\
$musk(166)$ & $\textbf{1.0}$ & $0.9808 \pm 0.0117$ & $\textbf{1.0}$ \\
$$ & $$ & $$ & $$ \\
$Average$ & $\textbf{0.8556}$ & $\textbf{0.8466}$ & $\textbf{0.8373}$ \\
\hline
\end{tabular}
<l2st>
<e>Only 4 datasets, so not clear at all</e>
<e>->More datasets</e>
</l2st>


@ -0,0 +1,7 @@
There are analyses that are only possible with many datasets
Here: unsupervised optimization
Given multiple AD models, find which one is best:
Use the AUC score? Requires anomalies->overfitting
Can you find an unsupervised method?
In general very complicated, so here we only focus on very small differences between the models.
So each model is an autoencoder, trained on the same dataset, where the difference is only in the initialisation


@ -0,0 +1,8 @@
First guess: the loss of the model on the training data
How to evaluate this?
Train many models, look at the average AUC score.
For the alternative, take groups of 20 models, and look at the AUC score of the best model in each group.
Is there a meaningful difference between the results? Give the result as a z-score (#(m_1-m_2)/sqrt(s_1**2+s_2**2)#)
This difference depends a lot on the dataset
->even a really good z-score does not mean much (sometimes #LessThan(30,z)#)
(repeat with two histones)
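The z-score comparison above, #(m_1-m_2)/sqrt(s_1**2+s_2**2)#, as a small worked sketch; the AUC values are invented for illustration only.

```python
# z-score between two groups of AUC scores: difference of means over the
# combined standard deviation.
from math import sqrt
from statistics import mean, stdev

def z_score(a, b):
    return (mean(a) - mean(b)) / sqrt(stdev(a) ** 2 + stdev(b) ** 2)

aucs_avg = [0.70, 0.72, 0.71, 0.69, 0.73]    # hypothetical: average model
aucs_best = [0.74, 0.76, 0.75, 0.77, 0.74]   # hypothetical: best of 20 models
print(round(z_score(aucs_best, aucs_avg), 2))  # -> 2.05
```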

1
prep/27loss/q Normal file

@ -0,0 +1 @@
Pick the model with the lowest l2\-loss

BIN
prep/27loss/z_loss.pdf Normal file

Binary file not shown.

3
prep/28Robustness/q Normal file

@ -0,0 +1,3 @@
Sample points within 1\% of the input-space width around each point.
For each point, find the maximum difference in output space.
Average this difference over all points.
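The three steps above can be sketched as follows, using a 1-D stand-in model instead of an autoencoder; the probe count and model functions are assumptions for illustration only.

```python
# Robustness: average over points of the worst output change under small
# input perturbations (within 1% of the input-space width).
import random

def robustness(f, points, width, n_probes=64, frac=0.01, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for x in points:
        worst = 0.0
        for _ in range(n_probes):
            probe = x + rng.uniform(-frac * width, frac * width)
            worst = max(worst, abs(f(probe) - f(x)))
        total += worst
    return total / len(points)          # lower = more robust

f_smooth = lambda x: 0.5 * x            # stand-in for a well-behaved model
f_steep = lambda x: 50.0 * x            # stand-in for a sensitive model
pts = [0.1 * i for i in range(10)]
print(robustness(f_smooth, pts, width=1.0) < robustness(f_steep, pts, width=1.0))  # -> True
```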

Binary file not shown.


@ -0,0 +1,3 @@
Pick random points in the input space.
Measure the distances in input and output space.
A low correlation indicates a good model.
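A sketch of this distance-correlation check with a 1-D stand-in model (the point count and model are assumptions for illustration): compare pairwise distances before and after the model and report their Pearson correlation.

```python
# Correlation between input-space and output-space pairwise distances.
import random
from statistics import mean

def pearson(u, v):
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u ** 0.5 * var_v ** 0.5)

def distance_correlation(f, n_points=200, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n_points)]
    ys = [f(x) for x in xs]
    d_in, d_out = [], []
    for i in range(n_points):
        for j in range(i + 1, n_points):
            d_in.append(abs(xs[i] - xs[j]))
            d_out.append(abs(ys[i] - ys[j]))
    return pearson(d_in, d_out)

print(round(distance_correlation(lambda x: 3 * x + 1), 2))  # -> 1.0 for a linear map
```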

Binary file not shown.

9
prep/30Other/q Normal file

@ -0,0 +1,9 @@
Things I still want to add:
<l2st>
Ensemble Methods
Visualisation options
Alternative Evaluations
Hyperparameter optimisation (with crossvalidation)
</l2st>

3
prep/31Feedback/q Normal file

@ -0,0 +1,3 @@
What do you think about this?
Is there something I should also add?
What would you need to actually use this?

Binary file not shown.


Binary file not shown.


Some files were not shown because too many files have changed in this diff