initial push

This commit is contained in:
Simon Klüttermann 2022-10-24 14:52:37 +02:00
commit dbeeb28be3
77 changed files with 2375 additions and 0 deletions

5
data/000.txt Normal file

@ -0,0 +1,5 @@
<frame >
<titlepage>
</frame>


@ -0,0 +1,10 @@
<frame title="Anomaly Detection">
<list>
<e>Find strange (unexpected) samples.</e>
<e>->If a traffic light is constantly yellow, probably something broke</e>
<e>But this could happen in a lot of different ways</e>
<e>->Most likely the traffic light is just off. But it could also fluctuate quickly or start smoking</e>
<e>How to cover all possible anomalies?</e>
<e>->Unsupervised Machine Learning</e>
</list>
</frame>


@ -0,0 +1,14 @@
<frame title="Unsupervised Machine Learning">
<list>
<e>Normal machine learning: Input - Label</e>
<e>Here: Only Input.</e>
<e>->Instead of classifying different types, try to understand your given dataset</e>
<e>Deviations from this understanding are anomalies</e>
<l2st>
<e>x: training samples</e>
<e>tx: test samples</e>
<e>ty: test labels (is a certain sample an anomaly or not)</e>
</l2st>
<e>Useful: \emph{peak /global/cardio.npz}</e>
</list>
</frame>

14
data/003kNN.txt Normal file

@ -0,0 +1,14 @@
<frame title="kNN">
<split>
<que>
<list>
<e>How to do this? Here is one algorithm: kNN</e>
<e>Goal: Generate an anomaly score (high value->highly anomalous)</e>
<e>Here: The anomaly score is the distance to the kth closest sample</e>
</list>
</que>
<que>
<i f="..//prep/03kNN/yanghuang 08.png" wmode="True"></i>
</que>
</split>
</frame>
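
The kNN scoring rule on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `knn_scores` and the toy data are hypothetical, Euclidean distance is assumed, and the names `x` and `tx` follow the slide's convention for training and test samples.

```python
import math


def knn_scores(train, test, k=5):
    """Return one anomaly score per test point: the distance to its
    k-th closest training sample (Euclidean distance assumed)."""
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training samples")
    scores = []
    for point in test:
        # Distances from this test point to every training sample, ascending.
        dists = sorted(math.dist(point, ref) for ref in train)
        scores.append(dists[k - 1])
    return scores


# Hypothetical toy data: four normal training points on a unit square.
x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
# One test point in the middle of the square, one far away from it.
tx = [(0.5, 0.5), (5.0, 5.0)]

scores = knn_scores(x, tx, k=2)
print(scores)
```

The far-away point receives the larger score, which is the behaviour the slide asks for (high value->highly anomalous).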

14
data/004kNN.txt Normal file

@ -0,0 +1,14 @@
<frame title="kNN">
<split>
<que>
<list>
<e>How to do this? Here is one algorithm: kNN</e>
<e>Goal: Generate an anomaly score (high value->highly anomalous)</e>
<e>Here: The anomaly score is the distance to the kth closest sample</e>
</list>
</que>
<que>
<i f="..//prep/04kNN/dist0.pdf" wmode="True"></i>
</que>
</split>
</frame>

3
data/005.txt Normal file

@ -0,0 +1,3 @@
<frame >
<i f="..//prep/05/dist0.pdf" wmode="True"></i>
</frame>

6
data/006AUC Score.txt Normal file

@ -0,0 +1,6 @@
<frame title="AUC Score">
<split>
<que w="0.47619047619047616"><i f="..//prep/06AUC_Score/02confusion.png" wmode="True"></i></que>
<que w="0.47619047619047616"><i f="..//prep/06AUC_Score/01dist0.pdf" wmode="True"></i></que>
</split>
</frame>

22
data/007AUC Score.txt Normal file

@ -0,0 +1,22 @@
<frame title="AUC Score">
<split>
<que>
<list>
<e>Iterate over every threshold</e>
<e>Plot FPR vs. TPR</e>
<e>False Positive Rate</e>
<l2st>
<e>$\frac{FP}{FP+TN}$</e>
</l2st>
<e>True Positive Rate</e>
<l2st>
<e>$\frac{TP}{TP+FN}$</e>
</l2st>
<e>ROC-AUC: Integral of this curve!</e>
</list>
</que>
<que>
<i f="..//prep/07AUC_Score/roc.pdf" wmode="True"></i>
</que>
</split>
</frame>
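
The recipe on this slide (iterate over thresholds, compute FPR and TPR, integrate the curve) can be sketched as follows. The helper names `roc_curve_points` and `auc` and the toy scores are hypothetical, and the trapezoidal rule is assumed for the integral; `ty` follows the slide's naming for test labels (1 = anomaly).

```python
def roc_curve_points(labels, scores):
    """Return (FPR, TPR) pairs, one per distinct threshold.
    A sample is flagged as an anomaly when its score >= threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        # FPR = FP / (FP + TN), TPR = TP / (TP + FN)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Integrate the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test labels and anomaly scores.
ty = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]

roc = roc_curve_points(ty, scores)
roc_auc = auc(roc)
print(roc_auc)
```

Recomputing the confusion counts for every threshold is quadratic in the number of samples; that is fine for a sketch, but a library routine would be the better choice on real data.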

9
data/008AUC Score.txt Normal file

@ -0,0 +1,9 @@
<frame title="AUC Score">
<list>
<e>Calculate with \emph{sklearn.metrics.roc\_auc\_score}</e>
<e>Higher AUC score->better</e>
<e>$AUC=1.0$->Perfect separation</e>
<e>$AUC=0.5$->Random model</e>
<e>$AUC=0.0$->Inverse separation (every anomaly is ranked as normal, and every normal sample as anomalous)</e>
</list>
</frame>
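
The three reference values on this slide can be checked with a small sketch. It uses the ranking view of ROC-AUC: the fraction of (anomaly, normal) pairs in which the anomaly gets the higher score, with ties counted as half. The function name `pairwise_auc` and the toy scores are hypothetical; on the same inputs, `sklearn.metrics.roc_auc_score(ty, scores)` should give the same values.

```python
def pairwise_auc(labels, scores):
    """ROC-AUC as the fraction of (anomaly, normal) pairs that are
    ranked correctly; a tie counts as half a correct pair."""
    anomalies = [s for y, s in zip(labels, scores) if y == 1]
    normals = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for a in anomalies:
        for n in normals:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomalies) * len(normals))


# Hypothetical test labels: three normal samples, two anomalies.
ty = [0, 0, 0, 1, 1]

perfect = pairwise_auc(ty, [0.1, 0.2, 0.3, 0.8, 0.9])   # anomalies score highest
inverse = pairwise_auc(ty, [0.8, 0.9, 0.7, 0.1, 0.2])   # anomalies score lowest
constant = pairwise_auc(ty, [0.5, 0.5, 0.5, 0.5, 0.5])  # score carries no information

print(perfect, inverse, constant)
```

A constant score lands at 0.5, the same level a random model reaches on average.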

3
data/009AUC Scores.txt Normal file

@ -0,0 +1,3 @@
<frame title="AUC Scores">
<i f="..//prep/09AUC_Scores/students.png" wmode="True"></i>
</frame>

11
data/010AutoML.txt Normal file

@ -0,0 +1,11 @@
<frame title="AutoML">
<list>
<e>But: We can beat this!</e>
<e>How? Hyperparameters</e>
<l2st>
<e>Every algorithm has hyperparameters that control how it works</e>
<e>For example: k in kNN (number of close points considered)</e>
</l2st>
<e>Let's take the worst algorithm (kNN: $0.927$) and try to improve it</e>
</list>
</frame>
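
Tuning the hyperparameter k can be illustrated with a plain grid search: score the test samples for each candidate k and keep the k with the highest AUC. Everything below is a hypothetical sketch (helper names, one-dimensional toy data, candidate list); it is not the optimizer used for the plots in this deck.

```python
import math


def knn_scores(train, test, k):
    """Anomaly score: distance to the k-th closest training sample."""
    return [sorted(math.dist(p, ref) for ref in train)[k - 1] for p in test]


def pairwise_auc(labels, scores):
    """ROC-AUC as the fraction of correctly ranked (anomaly, normal) pairs."""
    anomalies = [s for y, s in zip(labels, scores) if y == 1]
    normals = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if a > n else 0.5 if a == n else 0.0
               for a in anomalies for n in normals)
    return wins / (len(anomalies) * len(normals))


def best_k(train, test, labels, candidates):
    """Evaluate every candidate k and return (best k, AUC per k)."""
    results = {k: pairwise_auc(labels, knn_scores(train, test, k))
               for k in candidates}
    return max(results, key=results.get), results


# Hypothetical 1-D data: the training set contains one stray point at 10.0.
x = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
tx = [(0.5,), (1.5,), (2.5,), (10.2,), (6.0,)]
ty = [0, 0, 0, 1, 1]

best, results = best_k(x, tx, ty, candidates=[1, 2, 3])
print(best, results)
```

With k=1 the anomaly at 10.2 sits next to the stray training point and looks normal; a larger k is more robust to such contamination. Note that this search uses the test labels `ty` to choose k, so it is no longer purely unsupervised.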

3
data/011Optimize.txt Normal file

@ -0,0 +1,3 @@
<frame title="Optimize">
<i f="..//prep/11Optimize/baseline.png" wmode="True"></i>
</frame>

3
data/012Optimize.txt Normal file

@ -0,0 +1,3 @@
<frame title="Optimize">
<i f="..//prep/12Optimize/optimize.png" wmode="True"></i>
</frame>

13
data/013flaml.txt Normal file

@ -0,0 +1,13 @@
<frame title="flaml">
<split>
<que>
<list>
<e>\emph{source folder/bin/activate}</e>
<e>\emph{pip install flaml}</e>
</list>
</que>
<que>
<i f="..//prep/15flaml/forflaml.png" wmode="True"></i>
</que>
</split>
</frame>

3
data/014flaml.txt Normal file

@ -0,0 +1,3 @@
<frame title="flaml">
<i f="..//prep/16flaml/flaml.png" wmode="True"></i>
</frame>

3
data/015.txt Normal file

@ -0,0 +1,3 @@
<frame >
<i f="..//prep/17/hist.pdf" wmode="True"></i>
</frame>

8
data/016Your Turn.txt Normal file

@ -0,0 +1,8 @@
<frame title="Your Turn">
<list>
<e>Remember your last algorithm</e>
<e>Find its hyperparameters (Tip: pyod website)</e>
<e>Optimize your algorithm and give me a new AUC!</e>
<e>Bonus Question: Is there a problem with what we are doing?</e>
</list>
</frame>

12
general.txt Normal file

@ -0,0 +1,12 @@
<plt>
<name Current experiment status>
<title Anomaly Detection and AutoML>
<stitle Anomaly Detection and AutoML>
<institute ls9 tu Dortmund>
<theme CambridgeUS>
<colo dolphin>
</plt>

Binary file not shown.


@ -0,0 +1,6 @@
Two distributions
<l2st>
One known (=normal)
One unknown (=anomalies)
</l2st>
Separate them

Binary file not shown.


@ -0,0 +1,7 @@
Two distributions
<l2st>
One known (=normal)
One unknown (=anomalies)
</l2st>
Separate them
Problem: few anomalies


@ -0,0 +1,8 @@
Anomalies are rare, so often only a few data points are known (e.g. machine failure in an aircraft)
In practice, anomalies might appear that are not known during testing
->So train the model only on normal samples
Unsupervised Machine Learning
<l2st>
What can we say without knowing anomalies?
''Understand your dataset''
</l2st>

Binary file not shown.

Binary file not shown.


@ -0,0 +1,8 @@
Anomalies are rare, so often only a few data points are known (e.g. machine failure in an aircraft)
In practice, anomalies might appear that are not known during testing
->So train the model only on normal samples
Unsupervised Machine Learning
<l2st>
What can we say without knowing anomalies?
''Understand your dataset''
</l2st>

Binary file not shown.

Size: 648 KiB


@ -0,0 +1,6 @@
Seems easy? Now do this
<l2st>
in thousands of dimensions
with complicated distributions
and overlap between anomalies and normal points
</l2st>

BIN
old/06AutoML/Download.png Normal file

Binary file not shown.

Size: 47 KiB

3
old/06AutoML/q Normal file

@ -0,0 +1,3 @@
Most machine learning requires Hyperparameter Optimisation
(Find model parameters that result in the best results)
->AutoML: Do this automatically as fast as possible

9
old/07AutoAD/q Normal file

@ -0,0 +1,9 @@
So let's combine both (Auto Anomaly Detection)
->Problem
<l2st>
AutoML requires evaluation (loss, accuracy, AUC) to optimize
AD can only be evaluated with regard to the anomalies
->no longer unsupervised
</l2st>
So most Anomaly Detection is ''unoptimized''

Binary file not shown.


@ -0,0 +1,3 @@
So how to solve this?
One option: Think of some function to evaluate only the normal points
->A bit hard to do in a case study


@ -0,0 +1,5 @@
So how to solve this?
One option: ''Just find the best solution directly''
->Zero Shot AutoML
Find best practices for hyperparameters
Requires optimisation for each model separately -> matches the case study structure quite well!

12
old/09Course/q Normal file

@ -0,0 +1,12 @@
Basics of Scientific Computing
Basics of AD
Basics of AutoML
Build groups for each algorithm
<l2st>
Choose a set of Hyperparameters
Find ''best practices'' for them
Maybe consider more complicated Transformations (Preprocessing, Ensemble)
</l2st>
Compare between groups (best algorithm for current situation)
Evaluate on new datasets
Write a report/Present your work

BIN
old/09Course/table.png Normal file

Binary file not shown.

Size: 80 KiB

7
old/10Questions/q Normal file

@ -0,0 +1,7 @@
Requirements:
<l2st>
MD Req 1->MD Req 8
Basic Python/Math Knowledge
Motivation to learn something new;)
</l2st>
Registration till Saturday, by Email to Simon.Kluettermann@cs.tu-dortmund.de

BIN
other/carina.pdf Normal file

Binary file not shown.

BIN
other/other.pdf Normal file

Binary file not shown.

3
out/compile.bat Normal file

@ -0,0 +1,3 @@
pdflatex main.tex
pdflatex main.tex

3
out/compile.sh Executable file

@ -0,0 +1,3 @@
pdflatex main.tex
pdflatex main.tex

110
out/label.json Normal file

@ -0,0 +1,110 @@
[
{
"typ": "img",
"files": [
"..//prep/03kNN/yanghuang 08.png"
],
"label": "prep03kNNyanghuang 08png",
"caption": "",
"where": "../case2/data/003kNN.txt"
},
{
"typ": "img",
"files": [
"..//prep/04kNN/dist0.pdf"
],
"label": "prep04kNNdist0pdf",
"caption": "",
"where": "../case2/data/004kNN.txt"
},
{
"typ": "img",
"files": [
"..//prep/05/dist0.pdf"
],
"label": "prep05dist0pdf",
"caption": "",
"where": "../case2/data/005.txt"
},
{
"typ": "img",
"files": [
"..//prep/06AUC_Score/02confusion.png"
],
"label": "prep06AUC_Score02confusionpng",
"caption": "",
"where": "../case2/data/006AUC Score.txt"
},
{
"typ": "img",
"files": [
"..//prep/06AUC_Score/01dist0.pdf"
],
"label": "prep06AUC_Score01dist0pdf",
"caption": "",
"where": "../case2/data/006AUC Score.txt"
},
{
"typ": "img",
"files": [
"..//prep/07AUC_Score/roc.pdf"
],
"label": "prep07AUC_Scorerocpdf",
"caption": "",
"where": "../case2/data/007AUC Score.txt"
},
{
"typ": "img",
"files": [
"..//prep/09AUC_Scores/students.png"
],
"label": "prep09AUC_Scoresstudentspng",
"caption": "",
"where": "../case2/data/009AUC Scores.txt"
},
{
"typ": "img",
"files": [
"..//prep/11Optimize/baseline.png"
],
"label": "prep11Optimizebaselinepng",
"caption": "",
"where": "../case2/data/011Optimize.txt"
},
{
"typ": "img",
"files": [
"..//prep/12Optimize/optimize.png"
],
"label": "prep12Optimizeoptimizepng",
"caption": "",
"where": "../case2/data/012Optimize.txt"
},
{
"typ": "img",
"files": [
"..//prep/15flaml/forflaml.png"
],
"label": "prep15flamlforflamlpng",
"caption": "",
"where": "../case2/data/013flaml.txt"
},
{
"typ": "img",
"files": [
"..//prep/16flaml/flaml.png"
],
"label": "prep16flamlflamlpng",
"caption": "",
"where": "../case2/data/014flaml.txt"
},
{
"typ": "img",
"files": [
"..//prep/17/hist.pdf"
],
"label": "prep17histpdf",
"caption": "",
"where": "../case2/data/015.txt"
}
]
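
Judging only from the entries listed above, each "label" appears to be the image path with every "." and "/" removed. The sketch below reproduces that pattern; the function name `make_label` is hypothetical and this is an inference from the data, not the slide generator's actual code.

```python
def make_label(path):
    """Derive a label from an image path by dropping dots and slashes
    (pattern inferred from the label.json entries, not confirmed)."""
    return path.replace(".", "").replace("/", "")


label_knn = make_label("..//prep/03kNN/yanghuang 08.png")
label_roc = make_label("..//prep/07AUC_Score/roc.pdf")
print(label_knn)
print(label_roc)
```

Spaces in file names survive this mapping, which is why one label in the list contains a space.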

138
out/main.aux Normal file

@ -0,0 +1,138 @@
\relax
\providecommand\hyper@newdestlabel[2]{}
\providecommand\HyperFirstAtBeginDocument{\AtBeginDocument}
\HyperFirstAtBeginDocument{\ifx\hyper@anchor\@undefined
\global\let\oldcontentsline\contentsline
\gdef\contentsline#1#2#3#4{\oldcontentsline{#1}{#2}{#3}}
\global\let\oldnewlabel\newlabel
\gdef\newlabel#1#2{\newlabelxx{#1}#2}
\gdef\newlabelxx#1#2#3#4#5#6{\oldnewlabel{#1}{{#2}{#3}}}
\AtEndDocument{\ifx\hyper@anchor\@undefined
\let\contentsline\oldcontentsline
\let\newlabel\oldnewlabel
\fi}
\fi}
\global\let\hyper@last\relax
\gdef\HyperFirstAtBeginDocument#1{#1}
\providecommand\HyField@AuxAddToFields[1]{}
\providecommand\HyField@AuxAddToCoFields[2]{}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{1}{1/1}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {1}{1}}}
\newlabel{Anomaly Detection<1>}{{2}{2}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Anomaly Detection<1>}{2}}
\newlabel{Anomaly Detection}{{2}{2}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Anomaly Detection}{2}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{2}{2/2}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {2}{2}}}
\newlabel{Unsupervised Machine Learning<1>}{{3}{3}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Unsupervised Machine Learning<1>}{3}}
\newlabel{Unsupervised Machine Learning}{{3}{3}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Unsupervised Machine Learning}{3}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{3}{3/3}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {3}{3}}}
\newlabel{kNN<1>}{{4}{4}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {kNN<1>}{4}}
\newlabel{kNN}{{4}{4}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {kNN}{4}}
\newlabel{fig:prep03kNNyanghuang 08png}{{4}{4}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep03kNNyanghuang 08png}{4}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{4}{4/4}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {4}{4}}}
\newlabel{kNN<1>}{{5}{5}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {kNN<1>}{5}}
\newlabel{kNN}{{5}{5}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {kNN}{5}}
\newlabel{fig:prep04kNNdist0pdf}{{5}{5}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep04kNNdist0pdf}{5}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{5}{5/5}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {5}{5}}}
\newlabel{fig:prep05dist0pdf}{{6}{6}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep05dist0pdf}{6}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{6}{6/6}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {6}{6}}}
\newlabel{AUC Score<1>}{{7}{7}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {AUC Score<1>}{7}}
\newlabel{AUC Score}{{7}{7}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {AUC Score}{7}}
\newlabel{fig:prep06AUC_Score02confusionpng}{{7}{7}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep06AUC_Score02confusionpng}{7}}
\newlabel{fig:prep06AUC_Score01dist0pdf}{{7}{7}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep06AUC_Score01dist0pdf}{7}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{7}{7/7}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {7}{7}}}
\newlabel{AUC Score<1>}{{8}{8}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {AUC Score<1>}{8}}
\newlabel{AUC Score}{{8}{8}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {AUC Score}{8}}
\newlabel{fig:prep07AUC_Scorerocpdf}{{8}{8}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep07AUC_Scorerocpdf}{8}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{8}{8/8}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {8}{8}}}
\newlabel{AUC Score<1>}{{9}{9}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {AUC Score<1>}{9}}
\newlabel{AUC Score}{{9}{9}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {AUC Score}{9}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{9}{9/9}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {9}{9}}}
\newlabel{AUC Scores<1>}{{10}{10}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {AUC Scores<1>}{10}}
\newlabel{AUC Scores}{{10}{10}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {AUC Scores}{10}}
\newlabel{fig:prep09AUC_Scoresstudentspng}{{10}{10}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep09AUC_Scoresstudentspng}{10}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{10}{10/10}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {10}{10}}}
\newlabel{AutoML<1>}{{11}{11}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {AutoML<1>}{11}}
\newlabel{AutoML}{{11}{11}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {AutoML}{11}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{11}{11/11}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {11}{11}}}
\newlabel{Optimize<1>}{{12}{12}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Optimize<1>}{12}}
\newlabel{Optimize}{{12}{12}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Optimize}{12}}
\newlabel{fig:prep11Optimizebaselinepng}{{12}{12}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep11Optimizebaselinepng}{12}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{12}{12/12}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {12}{12}}}
\newlabel{Optimize<1>}{{13}{13}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Optimize<1>}{13}}
\newlabel{Optimize}{{13}{13}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Optimize}{13}}
\newlabel{fig:prep12Optimizeoptimizepng}{{13}{13}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep12Optimizeoptimizepng}{13}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{13}{13/13}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {13}{13}}}
\newlabel{flaml<1>}{{14}{14}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {flaml<1>}{14}}
\newlabel{flaml}{{14}{14}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {flaml}{14}}
\newlabel{fig:prep15flamlforflamlpng}{{14}{14}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep15flamlforflamlpng}{14}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{14}{14/14}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {14}{14}}}
\newlabel{flaml<1>}{{15}{15}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {flaml<1>}{15}}
\newlabel{flaml}{{15}{15}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {flaml}{15}}
\newlabel{fig:prep16flamlflamlpng}{{15}{15}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep16flamlflamlpng}{15}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{15}{15/15}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {15}{15}}}
\newlabel{fig:prep17histpdf}{{16}{16}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {fig:prep17histpdf}{16}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{16}{16/16}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {16}{16}}}
\newlabel{Your Turn<1>}{{17}{17}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Your Turn<1>}{17}}
\newlabel{Your Turn}{{17}{17}{}{Doc-Start}{}}
\@writefile{snm}{\beamer@slide {Your Turn}{17}}
\@writefile{nav}{\headcommand {\slideentry {0}{0}{17}{17/17}{}{0}}}
\@writefile{nav}{\headcommand {\beamer@framepages {17}{17}}}
\@writefile{nav}{\headcommand {\beamer@partpages {1}{17}}}
\@writefile{nav}{\headcommand {\beamer@subsectionpages {1}{17}}}
\@writefile{nav}{\headcommand {\beamer@sectionpages {1}{17}}}
\@writefile{nav}{\headcommand {\beamer@documentpages {17}}}
\@writefile{nav}{\headcommand {\gdef \inserttotalframenumber {17}}}
\gdef \@abspage@last{17}

1261
out/main.log Normal file

File diff suppressed because it is too large.

39
out/main.nav Normal file

@ -0,0 +1,39 @@
\headcommand {\slideentry {0}{0}{1}{1/1}{}{0}}
\headcommand {\beamer@framepages {1}{1}}
\headcommand {\slideentry {0}{0}{2}{2/2}{}{0}}
\headcommand {\beamer@framepages {2}{2}}
\headcommand {\slideentry {0}{0}{3}{3/3}{}{0}}
\headcommand {\beamer@framepages {3}{3}}
\headcommand {\slideentry {0}{0}{4}{4/4}{}{0}}
\headcommand {\beamer@framepages {4}{4}}
\headcommand {\slideentry {0}{0}{5}{5/5}{}{0}}
\headcommand {\beamer@framepages {5}{5}}
\headcommand {\slideentry {0}{0}{6}{6/6}{}{0}}
\headcommand {\beamer@framepages {6}{6}}
\headcommand {\slideentry {0}{0}{7}{7/7}{}{0}}
\headcommand {\beamer@framepages {7}{7}}
\headcommand {\slideentry {0}{0}{8}{8/8}{}{0}}
\headcommand {\beamer@framepages {8}{8}}
\headcommand {\slideentry {0}{0}{9}{9/9}{}{0}}
\headcommand {\beamer@framepages {9}{9}}
\headcommand {\slideentry {0}{0}{10}{10/10}{}{0}}
\headcommand {\beamer@framepages {10}{10}}
\headcommand {\slideentry {0}{0}{11}{11/11}{}{0}}
\headcommand {\beamer@framepages {11}{11}}
\headcommand {\slideentry {0}{0}{12}{12/12}{}{0}}
\headcommand {\beamer@framepages {12}{12}}
\headcommand {\slideentry {0}{0}{13}{13/13}{}{0}}
\headcommand {\beamer@framepages {13}{13}}
\headcommand {\slideentry {0}{0}{14}{14/14}{}{0}}
\headcommand {\beamer@framepages {14}{14}}
\headcommand {\slideentry {0}{0}{15}{15/15}{}{0}}
\headcommand {\beamer@framepages {15}{15}}
\headcommand {\slideentry {0}{0}{16}{16/16}{}{0}}
\headcommand {\beamer@framepages {16}{16}}
\headcommand {\slideentry {0}{0}{17}{17/17}{}{0}}
\headcommand {\beamer@framepages {17}{17}}
\headcommand {\beamer@partpages {1}{17}}
\headcommand {\beamer@subsectionpages {1}{17}}
\headcommand {\beamer@sectionpages {1}{17}}
\headcommand {\beamer@documentpages {17}}
\headcommand {\gdef \inserttotalframenumber {17}}

0
out/main.out Normal file

BIN
out/main.pdf Normal file

Binary file not shown.

40
out/main.snm Normal file

@ -0,0 +1,40 @@
\beamer@slide {Anomaly Detection<1>}{2}
\beamer@slide {Anomaly Detection}{2}
\beamer@slide {Unsupervised Machine Learning<1>}{3}
\beamer@slide {Unsupervised Machine Learning}{3}
\beamer@slide {kNN<1>}{4}
\beamer@slide {kNN}{4}
\beamer@slide {fig:prep03kNNyanghuang 08png}{4}
\beamer@slide {kNN<1>}{5}
\beamer@slide {kNN}{5}
\beamer@slide {fig:prep04kNNdist0pdf}{5}
\beamer@slide {fig:prep05dist0pdf}{6}
\beamer@slide {AUC Score<1>}{7}
\beamer@slide {AUC Score}{7}
\beamer@slide {fig:prep06AUC_Score02confusionpng}{7}
\beamer@slide {fig:prep06AUC_Score01dist0pdf}{7}
\beamer@slide {AUC Score<1>}{8}
\beamer@slide {AUC Score}{8}
\beamer@slide {fig:prep07AUC_Scorerocpdf}{8}
\beamer@slide {AUC Score<1>}{9}
\beamer@slide {AUC Score}{9}
\beamer@slide {AUC Scores<1>}{10}
\beamer@slide {AUC Scores}{10}
\beamer@slide {fig:prep09AUC_Scoresstudentspng}{10}
\beamer@slide {AutoML<1>}{11}
\beamer@slide {AutoML}{11}
\beamer@slide {Optimize<1>}{12}
\beamer@slide {Optimize}{12}
\beamer@slide {fig:prep11Optimizebaselinepng}{12}
\beamer@slide {Optimize<1>}{13}
\beamer@slide {Optimize}{13}
\beamer@slide {fig:prep12Optimizeoptimizepng}{13}
\beamer@slide {flaml<1>}{14}
\beamer@slide {flaml}{14}
\beamer@slide {fig:prep15flamlforflamlpng}{14}
\beamer@slide {flaml<1>}{15}
\beamer@slide {flaml}{15}
\beamer@slide {fig:prep16flamlflamlpng}{15}
\beamer@slide {fig:prep17histpdf}{16}
\beamer@slide {Your Turn<1>}{17}
\beamer@slide {Your Turn}{17}

497
out/main.tex Normal file

@ -0,0 +1,497 @@
\UseRawInputEncoding
%\documentclass[hyperref={pdfpagelabels=false}]{beamer}
\documentclass[hyperref={pdfpagelabels=false},aspectratio=169]{beamer}
% The hyperref option hyperref={pdfpagelabels=false} prevents the warning:
% Package hyperref Warning: Option `pdfpagelabels' is turned off
% (hyperref) because \thepage is undefined.
% Hyperref stopped early
%
\usepackage{lmodern}
% The lmodern package avoids the following warnings:
% LaTeX Font Warning: Font shape `OT1/cmss/m/n' in size <4> not available
% (Font) size <5> substituted on input line 22.
% LaTeX Font Warning: Size substitutions with differences
% (Font) up to 1.0pt have occurred.
%
% If \title{\ldots} and \author{\ldots} only appear after \begin{document},
% the following warning is raised:
% Package hyperref Warning: Option `pdfauthor' has already been used,
% (hyperref) ...
% That is why they are placed here, before \begin{document}
\title[Anomaly Detection and AutoML]{Anomaly Detection and AutoML}
\author{Simon Kluettermann}
\date{\today}
\institute{ls9 tu Dortmund}
% This prevents the navigation bar from being displayed.
\setbeamertemplate{navigation symbols}{}
% additionally, usepackage{beamerthemeshadow} is loaded
\usepackage{beamerthemeshadow}
\hypersetup{pdfstartview={Fit}} % fits the presentation to the window when first displayed
\usepackage{appendixnumberbeamer}
\usepackage{listings}
\usetheme{CambridgeUS}
\usepackage{ngerman}
\usecolortheme{dolphin}
% \beamersetuncovermixins{\opaqueness<1>{25}}{\opaqueness<2$\Rightarrow${15}}
% makes elements that are yet to appear (in the future)
% show up only faintly
%\beamersetuncovermixins{\opaqueness<1>{25}}{\opaqueness<2$\Rightarrow${15}}%here disabled
% also works for tables when teTeX is used\ldots
\renewcommand{\figurename}{}
\setbeamertemplate{footline}
{
\leavevmode%
\hbox{%
\begin{beamercolorbox}[wd=.4\paperwidth,ht=2.25ex,dp=1ex,center]{author in head/foot}%
\usebeamerfont{author in head/foot}\insertshorttitle
\end{beamercolorbox}%
\begin{beamercolorbox}[wd=.25\paperwidth,ht=2.25ex,dp=1ex,center]{title in head/foot}%
\usebeamerfont{title in head/foot}\insertsection
\end{beamercolorbox}%
\begin{beamercolorbox}[wd=.3499\paperwidth,ht=2.25ex,dp=1ex,right]{date in head/foot}%
\usebeamerfont{date in head/foot}\insertshortdate{}\hspace*{2em}
\hyperlink{toc}{\insertframenumber{} / \inserttotalframenumber\hspace*{2ex}}
\end{beamercolorbox}}%
\vskip0pt%
}
\usepackage[absolute,overlay]{textpos}
\usepackage{graphicx}
\newcommand{\source}[1]{\begin{textblock*}{9cm}(0.1cm,8.9cm)
\begin{beamercolorbox}[ht=0.5cm,left]{framesource}
\usebeamerfont{framesource}\usebeamercolor[fg!66]{framesource} Source: {#1}
\end{beamercolorbox}
\end{textblock*}}
\begin{document}
%from file ../case2/data/000.txt
\begin{frame}[label=]
\frametitle{}
\begin{titlepage}
\centering
{\huge\bfseries \par}
\vspace{2cm}
{\LARGE\itshape Simon Kluettermann\par}
\vspace{1.5cm}
{\scshape\Large Master Thesis in Physics\par}
\vspace{0.2cm}
{\Large submitted to the \par}
\vspace{0.2cm}
{\scshape\Large Faculty of Mathematics, Computer Science and Natural Sciences \par}
\vspace{0.2cm}
{\Large \par}
\vspace{0.2cm}
{\scshape\Large RWTH Aachen University}
\vspace{1cm}
\vfill
{\scshape\Large Department of Physics\par}
\vspace{0.2cm}
{\scshape\Large Institute for theoretical Particle Physics and Cosmology\par}
\vspace{0.2cm}
{ \Large\par}
\vspace{0.2cm}
{\Large First Referee: Prof. Dr. Michael Kraemer \par}
{\Large Second Referee: Prof. Dr. Felix Kahlhoefer}
\vfill
% Bottom of the page
{\large November 2020 \par}
\end{titlepage}
\pagenumbering{roman}
\thispagestyle{empty}
\null
\newpage
\setcounter{page}{1}
\pagenumbering{arabic}
\end{frame}
%from file ../case2/data/001Anomaly Detection.txt
\begin{frame}[label=Anomaly Detection]
\frametitle{Anomaly Detection}
\begin{itemize}
\item Find strange (unexpected) samples.
\item $\Rightarrow$If a traffic light is constantly yellow, probably something broke
\item But this could happen in a lot of different ways
\item $\Rightarrow$Most likely the traffic light is just off. But it could also fluctuate quickly or start smoking
\item How to cover all possible anomalies?
\item $\Rightarrow$Unsupervised Machine Learning
\end{itemize}
\end{frame}
%from file ../case2/data/002Unsupervised Machine Learning.txt
\begin{frame}[label=Unsupervised Machine Learning]
\frametitle{Unsupervised Machine Learning}
\begin{itemize}
\item Normal machine learning: Input - Label
\item Here: Only Input.
\item $\Rightarrow$Instead of classifying different types, try to understand your given dataset
\item Deviations from this understanding are anomalies
\begin{itemize}
\item x: training samples
\item tx: test samples
\item ty: test labels (is a certain sample an anomaly or not)
\end{itemize}
\item Useful: \emph{peak /global/cardio.npz}
\end{itemize}
\end{frame}
%from file ../case2/data/003kNN.txt
\begin{frame}[label=kNN]
\frametitle{kNN}
\begin{columns}[c] % align columns
\begin{column}{0.48\textwidth}%.48
\begin{itemize}
\item How to do this? Here is one algorithm: kNN
\item Goal: Generate an anomaly score (high value$\Rightarrow$highly anomalous)
\item Here: The anomaly score is the distance to the kth closest sample
\end{itemize}
\end{column}%
\hfill%
\begin{column}{0.48\textwidth}%.48
\begin{figure}[H]
\centering
\includegraphics[width=0.9\textwidth]{..//prep/03kNN/yanghuang 08.png}
\label{fig:prep03kNNyanghuang 08png}
\caption{[Yang, Huang 08]}
\end{figure}
\end{column}%
\hfill%
\end{columns}
\end{frame}
%from file ../case2/data/004kNN.txt
\begin{frame}[label=kNN]
\frametitle{kNN}
\begin{columns}[c] % align columns
\begin{column}{0.48\textwidth}%.48
\begin{itemize}
\item How to do this? Here is one algorithm: kNN
\item Goal: Generate an anomaly score (high value$\Rightarrow$highly anomalous)
\item Here: The anomaly score is the distance to the kth closest sample
\end{itemize}
\end{column}%
\hfill%
\begin{column}{0.48\textwidth}%.48
\begin{figure}[H]
\centering
\includegraphics[width=0.9\textwidth]{..//prep/04kNN/dist0.pdf}
\label{fig:prep04kNNdist0pdf}
\end{figure}
\end{column}%
\hfill%
\end{columns}
\end{frame}
%from file ../case2/data/005.txt
\begin{frame}[label=]
\frametitle{}
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{..//prep/05/dist0.pdf}
\label{fig:prep05dist0pdf}
\end{figure}
\end{frame}
%from file ../case2/data/006AUC Score.txt
\begin{frame}[label=AUC Score]
\frametitle{AUC Score}
\begin{columns}[c] % align columns
\begin{column}{0.47619047619047616\textwidth}%.48
\begin{figure}[H]
\centering
\includegraphics[width=0.9\textwidth]{..//prep/06AUC_Score/02confusion.png}
\label{fig:prep06AUC_Score02confusionpng}
\end{figure}
\end{column}%
\hfill%
\begin{column}{0.47619047619047616\textwidth}%.48
\begin{figure}[H]
\centering
\includegraphics[width=0.9\textwidth]{..//prep/06AUC_Score/01dist0.pdf}
\label{fig:prep06AUC_Score01dist0pdf}
\end{figure}
\end{column}%
\hfill%
\end{columns}
\end{frame}
%from file ../case2/data/007AUC Score.txt
\begin{frame}[label=AUC Score]
\frametitle{AUC Score}
\begin{columns}[c] % align columns
\begin{column}{0.48\textwidth}%.48
\begin{itemize}
\item Iterate over every threshold
\item Plot FPR vs. TPR
\item False Positive Rate
\begin{itemize}
\item $\frac{FP}{FP+TN}$
\end{itemize}
\item True Positive Rate
\begin{itemize}
\item $\frac{TP}{TP+FN}$
\end{itemize}
\item ROC-AUC: Integral of this curve!
\end{itemize}
\end{column}%
\hfill%
\begin{column}{0.48\textwidth}%.48
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{..//prep/07AUC_Score/roc.pdf}
\label{fig:prep07AUC_Scorerocpdf}
\end{figure}
\end{column}%
\hfill%
\end{columns}
\end{frame}
%from file ../case2/data/008AUC Score.txt
\begin{frame}[label=AUC Score]
\frametitle{AUC Score}
\begin{itemize}
\item Calculate with \emph{sklearn.metrics.roc\_auc\_score}
\item Higher AUC score$\Rightarrow$better
\item $AUC=1.0$$\Rightarrow$Perfect separation
\item $AUC=0.5$$\Rightarrow$Random model
\item $AUC=0.0$$\Rightarrow$Inverse separation (every anomaly is ranked as normal, and every normal sample as anomalous)
\end{itemize}
\end{frame}
%from file ../case2/data/009AUC Scores.txt
\begin{frame}[label=AUC Scores]
\frametitle{AUC Scores}
\begin{figure}[H]
\centering
\includegraphics[width=0.9\textwidth]{..//prep/09AUC_Scores/students.png}
\label{fig:prep09AUC_Scoresstudentspng}
\end{figure}
\end{frame}
%from file ../case2/data/010AutoML.txt
\begin{frame}[label=AutoML]
\frametitle{AutoML}
\begin{itemize}
\item But: We can beat this!
\item How? Hyperparameters
\begin{itemize}
\item Every algorithm has hyperparameters that control how it works
\item For example: k in kNN (number of close points considered)
\end{itemize}
\item Let's take the worst algorithm (kNN: $0.927$) and try to improve it
\end{itemize}
\end{frame}
%from file ../case2/data/011Optimize.txt
\begin{frame}[label=Optimize]
\frametitle{Optimize}
\begin{figure}[H]
\centering
\includegraphics[width=0.9\textwidth]{..//prep/11Optimize/baseline.png}
\label{fig:prep11Optimizebaselinepng}
\end{figure}
\end{frame}
%from file ../case2/data/012Optimize.txt
\begin{frame}[label=Optimize]
\frametitle{Optimize}
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{..//prep/12Optimize/optimize.png}
\label{fig:prep12Optimizeoptimizepng}
\end{figure}
\end{frame}
%from file ../case2/data/013flaml.txt
\begin{frame}[label=flaml]
\frametitle{flaml}
\begin{columns}[c] % align columns
\begin{column}{0.48\textwidth}%.48
\begin{itemize}
\item \emph{source folder/bin/activate}
\item \emph{pip install flaml}
\end{itemize}
\end{column}%
\hfill%
\begin{column}{0.48\textwidth}%.48
\begin{figure}[H]
\centering
\includegraphics[width=0.9\textwidth]{..//prep/15flaml/forflaml.png}
\label{fig:prep15flamlforflamlpng}
\end{figure}
\end{column}%
\hfill%
\end{columns}
\end{frame}
%from file ../case2/data/014flaml.txt
\begin{frame}[label=flaml]
\frametitle{flaml}
\begin{figure}[H]
\centering
\includegraphics[width=0.9\textwidth]{..//prep/16flaml/flaml.png}
\label{fig:prep16flamlflamlpng}
\end{figure}
\end{frame}
%from file ../case2/data/015.txt
\begin{frame}[label=]
\frametitle{}
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{..//prep/17/hist.pdf}
\label{fig:prep17histpdf}
\end{figure}
\end{frame}
%from file ../case2/data/016Your Turn.txt
\begin{frame}[label=Your Turn]
\frametitle{Your Turn}
\begin{itemize}
\item Remember your last algorithm
\item Find its hyperparameters (Tip: pyod website)
\item Optimize your algorithm and give me a new AUC!
\item Bonus Question: Is there a problem with what we are doing?
\end{itemize}
\end{frame}
\end{document}

0
out/main.toc Normal file

0
prep/000/nonl Normal file

1
prep/000/q Normal file

@ -0,0 +1 @@
<titlepage>


@ -0,0 +1,6 @@
Find strange (unexpected) samples.
->If a traffic light is constantly yellow, probably something broke
But this could happen in a lot of different ways
->Most likely the traffic light is just off. But it could also fluctuate quickly or start smoking
How to cover all possible anomalies?
->Unsupervised Machine Learning


@ -0,0 +1,10 @@
Normal machine learning: Input - Label
Here: Only Input.
->Instead of classifying different types, try to understand your given dataset
Deviations from this understanding are anomalies
<l2st>
x: training samples
tx: test samples
ty: test labels (is a certain sample an anomaly or not)
</l2st>
Useful: \emph{peak /global/cardio.npz}

3
prep/03kNN/q Normal file

@ -0,0 +1,3 @@
How to do this? Here is one algorithm: kNN
Goal: Generate an anomaly score (high value->highly anomalous)
Here: The anomaly score is the distance to the kth closest sample

BIN
prep/03kNN/yanghuang 08.png Normal file

Binary file not shown.

Size: 20 KiB

BIN
prep/04kNN/dist0.pdf Normal file

Binary file not shown.

3
prep/04kNN/q Normal file

@ -0,0 +1,3 @@
How to do this? Here is one algorithm: kNN
Goal: Generate an anomaly score (high value->highly anomalous)
Here: The anomaly score is the distance to the kth closest sample

BIN
prep/05/dist0.pdf Normal file

Binary file not shown.

Binary file not shown.

Binary file not shown.

Size: 28 KiB

13
prep/07AUC_Score/q Normal file

@ -0,0 +1,13 @@
Iterate over every threshold
Plot FPR vs. TPR
False Positive Rate
<l2st>
$\frac{FP}{FP+TN}$
</l2st>
True Positive Rate
<l2st>
$\frac{TP}{TP+FN}$
</l2st>
ROC-AUC: Integral of this curve!

BIN
prep/07AUC_Score/roc.pdf Normal file

Binary file not shown.

5
prep/08AUC_Score/q Normal file

@ -0,0 +1,5 @@
Calculate with \emph{sklearn.metrics.roc\_auc\_score}
Higher AUC score->better
$AUC=1.0$->Perfect separation
$AUC=0.5$->Random model
$AUC=0.0$->Inverse separation (every anomaly is ranked as normal, and every normal sample as anomalous)

Binary file not shown.

Size: 78 KiB

7
prep/10AutoML/q Normal file

@ -0,0 +1,7 @@
But: We can beat this!
How? Hyperparameters
<l2st>
Every algorithm has hyperparameters that control how it works
For example: k in kNN (number of close points considered)
</l2st>
Let's take the worst algorithm (kNN: $0.927$) and try to improve it

Binary file not shown.

Size: 33 KiB

Binary file not shown.

Size: 56 KiB

BIN
prep/15flaml/forflaml.png Normal file

Binary file not shown.

Size: 18 KiB

2
prep/15flaml/q Normal file

@ -0,0 +1,2 @@
\emph{source folder/bin/activate}
\emph{pip install flaml}

BIN
prep/16flaml/flaml.png Normal file

Binary file not shown.

Size: 26 KiB

BIN
prep/17/hist.pdf Normal file

Binary file not shown.

4
prep/18Your_Turn/q Normal file

@ -0,0 +1,4 @@
Remember your last algorithm
Find its hyperparameters (Tip: pyod website)
Optimize your algorithm and give me a new AUC!
Bonus Question: Is there a problem with what we are doing?

Binary file not shown.

Size: 20 KiB

Binary file not shown.

Size: 989 KiB

Binary file not shown.

Size: 265 KiB