semi final push
parent
a05699269c
commit
ea54626a20
|
@ -5,7 +5,7 @@
|
||||||
%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%
|
||||||
|
|
||||||
% Talk's title
|
% Talk's title
|
||||||
\settalktitle{Anomaly Detection Seminar 2021/2022}
|
\settalktitle{Seminar Unsupervised Machine Learning - Anomaly Detection}
|
||||||
|
|
||||||
% Author's name
|
% Author's name
|
||||||
\settalkauthor{}
|
\settalkauthor{}
|
||||||
|
|
14
slides.blg
|
@ -1,9 +1,9 @@
|
||||||
[0] Config.pm:312> INFO - This is Biber 2.15
|
[0] Config.pm:312> INFO - This is Biber 2.15
|
||||||
[0] Config.pm:315> INFO - Logfile is 'slides.blg'
|
[0] Config.pm:315> INFO - Logfile is 'slides.blg'
|
||||||
[53] biber:330> INFO - === Mo Okt 11, 2021, 17:26:11
|
[51] biber:330> INFO - === Mo Okt 11, 2021, 17:38:14
|
||||||
[63] Biber.pm:415> INFO - Reading 'slides.bcf'
|
[61] Biber.pm:415> INFO - Reading 'slides.bcf'
|
||||||
[115] Biber.pm:952> INFO - Found 0 citekeys in bib section 0
|
[111] Biber.pm:952> INFO - Found 0 citekeys in bib section 0
|
||||||
[119] Utils.pm:395> WARN - The file 'slides.bcf' does not contain any citations!
|
[116] Utils.pm:395> WARN - The file 'slides.bcf' does not contain any citations!
|
||||||
[125] bbl.pm:651> INFO - Writing 'slides.bbl' with encoding 'UTF-8'
|
[122] bbl.pm:651> INFO - Writing 'slides.bbl' with encoding 'UTF-8'
|
||||||
[126] bbl.pm:754> INFO - Output to slides.bbl
|
[122] bbl.pm:754> INFO - Output to slides.bbl
|
||||||
[126] Biber.pm:128> INFO - WARNINGS: 1
|
[122] Biber.pm:128> INFO - WARNINGS: 1
|
||||||
|
|
38
slides.log
|
@ -1,4 +1,4 @@
|
||||||
This is LuaHBTeX, Version 1.13.0 (TeX Live 2021/Arch Linux) (format=lualatex 2021.6.8) 11 OCT 2021 17:26
|
This is LuaHBTeX, Version 1.13.0 (TeX Live 2021/Arch Linux) (format=lualatex 2021.6.8) 11 OCT 2021 17:38
|
||||||
system commands enabled.
|
system commands enabled.
|
||||||
**slides
|
**slides
|
||||||
(./slides.tex
|
(./slides.tex
|
||||||
|
@ -2068,12 +2068,12 @@ Package polyglossia Info: Option: English, variant=american.
|
||||||
Package polyglossia Info: Option: english variant=american (with additional pat
|
Package polyglossia Info: Option: english variant=american (with additional pat
|
||||||
terns).
|
terns).
|
||||||
Module polyglossia Info: Language data for usenglishmax
|
Module polyglossia Info: Language data for usenglishmax
|
||||||
(polyglossia) patterns hyph-en-us.pat.txt
|
|
||||||
(polyglossia) hyphenation hyph-en-us.hyp.txt
|
|
||||||
(polyglossia) righthyphenmin 3
|
|
||||||
(polyglossia) lefthyphenmin 2
|
|
||||||
(polyglossia) loader loadhyph-en-us.tex
|
(polyglossia) loader loadhyph-en-us.tex
|
||||||
(polyglossia) synonyms on input line 35
|
(polyglossia) hyphenation hyph-en-us.hyp.txt
|
||||||
|
(polyglossia) synonyms
|
||||||
|
(polyglossia) lefthyphenmin 2
|
||||||
|
(polyglossia) patterns hyph-en-us.pat.txt
|
||||||
|
(polyglossia) righthyphenmin 3 on input line 35
|
||||||
Module polyglossia Info: Language usenglishmax was not yet loaded; created with
|
Module polyglossia Info: Language usenglishmax was not yet loaded; created with
|
||||||
id 2 on input line 35
|
id 2 on input line 35
|
||||||
Package polyglossia Info: Option: english variant=american (with additional pat
|
Package polyglossia Info: Option: english variant=american (with additional pat
|
||||||
|
@ -2135,12 +2135,12 @@ braces):
|
||||||
> {german/localnumeral} => {polyglossia@C@localnumeral}
|
> {german/localnumeral} => {polyglossia@C@localnumeral}
|
||||||
> {german/Localnumeral} => {polyglossia@C@localnumeral}.
|
> {german/Localnumeral} => {polyglossia@C@localnumeral}.
|
||||||
Module polyglossia Info: Language data for german
|
Module polyglossia Info: Language data for german
|
||||||
(polyglossia) patterns hyph-de-1901.pat.txt
|
|
||||||
(polyglossia) hyphenation
|
|
||||||
(polyglossia) righthyphenmin 2
|
|
||||||
(polyglossia) lefthyphenmin 2
|
|
||||||
(polyglossia) loader loadhyph-de-1901.tex
|
(polyglossia) loader loadhyph-de-1901.tex
|
||||||
(polyglossia) synonyms on input line 10
|
(polyglossia) hyphenation
|
||||||
|
(polyglossia) synonyms
|
||||||
|
(polyglossia) lefthyphenmin 2
|
||||||
|
(polyglossia) patterns hyph-de-1901.pat.txt
|
||||||
|
(polyglossia) righthyphenmin 2 on input line 10
|
||||||
Module polyglossia Info: Language german was not yet loaded; created with id 3 o
|
Module polyglossia Info: Language german was not yet loaded; created with id 3 o
|
||||||
n input line 10
|
n input line 10
|
||||||
Package polyglossia Info: Option: German, spelling=new.
|
Package polyglossia Info: Option: German, spelling=new.
|
||||||
|
@ -3104,12 +3104,12 @@ Package biblatex Info: ... file 'german.lbx' found.
|
||||||
(/usr/share/texmf-dist/tex/latex/biblatex/lbx/german.lbx
|
(/usr/share/texmf-dist/tex/latex/biblatex/lbx/german.lbx
|
||||||
File: german.lbx 2020/12/31 v3.16 biblatex localization (PK/MW)
|
File: german.lbx 2020/12/31 v3.16 biblatex localization (PK/MW)
|
||||||
Module polyglossia Info: Language data for ngerman
|
Module polyglossia Info: Language data for ngerman
|
||||||
(polyglossia) patterns hyph-de-1996.pat.txt
|
|
||||||
(polyglossia) hyphenation
|
|
||||||
(polyglossia) righthyphenmin 2
|
|
||||||
(polyglossia) lefthyphenmin 2
|
|
||||||
(polyglossia) loader loadhyph-de-1996.tex
|
(polyglossia) loader loadhyph-de-1996.tex
|
||||||
(polyglossia) synonyms on input line 561
|
(polyglossia) hyphenation
|
||||||
|
(polyglossia) synonyms
|
||||||
|
(polyglossia) lefthyphenmin 2
|
||||||
|
(polyglossia) patterns hyph-de-1996.pat.txt
|
||||||
|
(polyglossia) righthyphenmin 2 on input line 561
|
||||||
Module polyglossia Info: Language ngerman was not yet loaded; created with id 5
|
Module polyglossia Info: Language ngerman was not yet loaded; created with id 5
|
||||||
on input line 561
|
on input line 561
|
||||||
)
|
)
|
||||||
|
@ -3961,15 +3961,15 @@ Here is how much of LuaTeX's memory you used:
|
||||||
n, 63 penalty, 5 margin_kern, 361 glyph, 256 attribute, 92 glue_spec, 256 attrib
|
n, 63 penalty, 5 margin_kern, 361 glyph, 256 attribute, 92 glue_spec, 256 attrib
|
||||||
ute_list, 4 write, 24 pdf_literal, 92 pdf_colorstack, 1 pdf_setmatrix, 1 pdf_sav
|
ute_list, 4 write, 24 pdf_literal, 92 pdf_colorstack, 1 pdf_setmatrix, 1 pdf_sav
|
||||||
e, 1 pdf_restore nodes
|
e, 1 pdf_restore nodes
|
||||||
avail lists: 1:3,2:387,3:215,4:335,5:215,6:59,7:2079,8:9,9:466,10:24,11:136,1
|
avail lists: 1:4,2:387,3:215,4:335,5:215,6:59,7:2079,8:9,9:466,10:24,11:136,1
|
||||||
2:1
|
2:1
|
||||||
81249 multiletter control sequences out of 65536+600000
|
81249 multiletter control sequences out of 65536+600000
|
||||||
116 fonts using 34613951 bytes
|
116 fonts using 34614111 bytes
|
||||||
136i,20n,154p,819b,2327s stack positions out of 5000i,500n,10000p,200000b,80000s
|
136i,20n,154p,819b,2327s stack positions out of 5000i,500n,10000p,200000b,80000s
|
||||||
</usr/share/texmf-dist/fonts/opentype/public/libertinus-fonts/LibertinusSans-Bol
|
</usr/share/texmf-dist/fonts/opentype/public/libertinus-fonts/LibertinusSans-Bol
|
||||||
d.otf></usr/share/texmf-dist/fonts/opentype/public/libertinus-fonts/LibertinusSa
|
d.otf></usr/share/texmf-dist/fonts/opentype/public/libertinus-fonts/LibertinusSa
|
||||||
ns-Regular.otf>
|
ns-Regular.otf>
|
||||||
Output written on slides.pdf (22 pages, 2035288 bytes).
|
Output written on slides.pdf (22 pages, 2035304 bytes).
|
||||||
|
|
||||||
PDF statistics: 262 PDF objects out of 1000 (max. 8388607)
|
PDF statistics: 262 PDF objects out of 1000 (max. 8388607)
|
||||||
172 compressed objects within 2 object streams
|
172 compressed objects within 2 object streams
|
||||||
|
|
BIN
slides.pdf
Binary file not shown.
|
@ -39,8 +39,8 @@
|
||||||
\begin{columns}
|
\begin{columns}
|
||||||
\begin{column}{.475\textwidth}
|
\begin{column}{.475\textwidth}
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item Kick-Off
|
\item Kick-Off Meeting
|
||||||
\item Some Formal Stuff
|
\item Some Formalities
|
||||||
\item Short Overview of the Topics
|
\item Short Overview of the Topics
|
||||||
\end{itemize}
|
\end{itemize}
|
||||||
\begin{center}
|
\begin{center}
|
||||||
|
@ -53,7 +53,7 @@
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item Choose a couple topics
|
\item Choose a couple topics
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item Since we are only a few, you can make these requests quite complicated if you like (I prefer topic 1, but I would also take 3 or 7, except when I can do it in German, then I would prefer topic 12)
|
\item Since we are only a few, you can make these requests quite complicated (I prefer topic 1, but I would also take 3 or 7, except when I can do it in German, then I would prefer topic 12)
|
||||||
\end{itemize}
|
\end{itemize}
|
||||||
\item Send your choice to Simon.Kluettermann@cs.tu-dortmund.de (till tomorrow 13.10.2021 23:59)
|
\item Send your choice to Simon.Kluettermann@cs.tu-dortmund.de (till tomorrow 13.10.2021 23:59)
|
||||||
\item You will be assigned one in the next few days
|
\item You will be assigned one in the next few days
|
||||||
|
|
91
summary.txt
|
@ -1,63 +1,66 @@
|
||||||
Contextual Outliers
|
1) Anomaly Detection for Monitoring
|
||||||
This Paper focuses on the interpretability of anomaly detection methods. The Method described splits the set of normal events into groups and tries to relate any abnormal event to its surrounding normal ones. I would say it is more practical, and I strongly encourage you to implement this algorithm if you choose it.
|
This is more of a book than a Paper, so it should be perfect for you if you do not have that much experience. It focuses on Time Series Analysis, namely the Task of detecting when a continuous datastream becomes anomalous. This is, for example, useful for a machine supervised by sensors that at some point stops working (and thus changes the sensor output).
|
||||||
https://arxiv.org/abs/1711.10589
|
https://assets.dynatrace.com/content/dam/en/wp/Anomaly-Detection-for-Monitoring-Ruxit.pdf
|
||||||
|
|
||||||
Active AD via Ensembles...
|
2) A comprehensive survey of anomaly detection techniques for high dimensional big data
|
||||||
This Paper tries to do a lot. I suggest that you focus on the active learning part. Alternatively, we also have a paper on ensembles, so if you both want, you can combine these papers to be worked on by two Students. Active AD extends the task of finding anomalies to the case in which the anomaly status of the training events is not clearly defined. Its focus lies in minimizing the amount of human work needed to classify a given dataset (given some labels, train a model, find those new events that are unclear, classify those, restart).
|
Anomaly Detection is generally more complicated when you are given higher dimensional data (Curse of dimensionality). This seems a little weird, as machine learning usually improves when you are given more information. I imagine it as useless features confusing the algorithm. This Paper could be seen as a study of this phenomenon.
|
||||||
I want to note here that great work on an easy topic is, for us, the same as good work on a hard topic.
|
https://journalofbigdata.springeropen.com/track/pdf/10.1186/s40537-020-00320-x.pdf
|
||||||
https://arxiv.org/pdf/1901.08930
|
|
||||||
|
|
||||||
Interpretable AD for Device Failure
|
3) A Comprehensive Survey on Graph Anomaly Detection with Deep Learning
|
||||||
This is an Application Paper. Its complexity comes mostly from the fact that real world data is messy and the Paper addresses ways to mitigate this.
|
|
||||||
https://arxiv.org/pdf/2007.10088
|
|
||||||
|
|
||||||
Neural Transformation Learning for Deep Anomaly Detection Beyond Images
|
|
||||||
While for Image data, certain pre-Transformations (like Rotations) can clearly improve Machine Learning Tasks like Anomaly Detection, this is much less well defined for Time-Series/Tabular data. This Paper tries to solve this by defining learnable Transformations.
|
|
||||||
https://arxiv.org/pdf/2103.16440
|
|
||||||
|
|
||||||
Unsupervised Anomaly Detection Ensembles using Item Response Theory
|
|
||||||
Different AD algorithms are usually better at finding different types of anomalies. To get a more general algorithm you can combine multiple ones into one using Ensembles.
|
|
||||||
This Paper could be merged together with "Active AD via Ensembles" to be handled by two students.
|
|
||||||
https://arxiv.org/pdf/2106.06243
|
|
||||||
|
|
||||||
A Comprehensive Survey on Graph Anomaly Detection with Deep Learning
|
|
||||||
A lot of datasets that are interesting to AD (for example Email Communications or Trading Data) can be best represented as graphs. This poses unique challenges for AD algorithms.
|
A lot of datasets that are interesting to AD (for example Email Communications or Trading Data) can be best represented as graphs. This poses unique challenges for AD algorithms.
|
||||||
This is a paper that could either be handled by two students or split up into two. Maybe one considers anomalous graphs, while the other one considers anomalous nodes in graphs.
|
This is a paper that could either be handled by two students or split up into two. Maybe one considers anomalous graphs, while the other one considers anomalous nodes in graphs.
|
||||||
https://arxiv.org/pdf/2106.07178
|
https://arxiv.org/pdf/2106.07178
|
||||||
|
|
||||||
Additive Explanations for Anomalies Detected from Multivariate Temporal Data
|
4) LOF: identifying Density-Based Local Outliers
|
||||||
|
LOF is a classical algorithm used in many Applications. This is the original Paper introducing it. As this is a fairly old Paper, you will also find a lot of other sources describing LOF.
|
||||||
|
https://www.dbs.ifi.lmu.de/Publikationen/Papers/LOF.pdf
|
||||||
|
|
||||||
|
5) HiCS: High Contrast Subspaces for Density-Based Outlier Ranking
|
||||||
|
As most datapoints are quite high-dimensional, it is often the case that some features are useless and could actually help hide the true anomalies. This Paper suggests a method to select a subspace that filters out unimportant features.
|
||||||
|
This paper was co-written by Prof. Müller and might be related to a future Master's thesis.
|
||||||
|
https://www.ipd.kit.edu/~muellere/publications/ICDE2012.pdf
|
||||||
|
|
||||||
|
6) Neural Transformation Learning for Deep Anomaly Detection Beyond Images
|
||||||
|
While for Image data, certain pre-Transformations (like Rotations) can clearly improve Machine Learning Tasks like Anomaly Detection, this is much less well defined for Time-Series/Tabular data. This Paper tries to solve this by defining learnable Transformations.
|
||||||
|
https://arxiv.org/pdf/2103.16440
|
||||||
|
|
||||||
|
7) A Survey on GANs for Anomaly Detection
|
||||||
|
GANs are an advanced ML method, normally used to generate really realistic artificial Images (check out https://thispersondoesnotexist.com/ if you have never done so). But these can also be used for anomaly detection. Your task would be to explain how.
|
||||||
|
https://arxiv.org/pdf/1906.11632
|
||||||
|
|
||||||
|
|
||||||
|
8) Unsupervised Anomaly Detection Ensembles using Item Response Theory
|
||||||
|
Different AD algorithms are usually better at finding different types of anomalies. To get a more general algorithm you can combine multiple ones into one using Ensembles.
|
||||||
|
This Paper could be merged together with "Active AD via Ensembles" to be handled by two students.
|
||||||
|
https://arxiv.org/pdf/2106.06243
|
||||||
|
|
||||||
|
|
||||||
|
9) Active AD via Ensembles...
|
||||||
|
This Paper tries to do a lot. I suggest that you focus on the active learning part. Alternatively, we also have a paper on ensembles, so if you both want, you can combine these papers to be worked on by two Students. Active AD extends the task of finding anomalies to the case in which the anomaly status of the training events is not clearly defined. Its focus lies in minimizing the amount of human work needed to classify a given dataset (given some labels, train a model, find those new events that are unclear, classify those, restart).
|
||||||
|
I want to note here that great work on an easy topic is, for us, the same as good work on a hard topic.
|
||||||
|
https://arxiv.org/pdf/1901.08930
|
||||||
|
|
||||||
|
10) Contextual Outliers
|
||||||
|
This Paper focuses on the interpretability of anomaly detection methods. The Method described splits the set of normal events into groups and tries to relate any abnormal event to its surrounding normal ones. I would say it is more practical, and I strongly encourage you to implement this algorithm if you choose it.
|
||||||
|
https://arxiv.org/abs/1711.10589
|
||||||
|
|
||||||
|
|
||||||
|
11) Additive Explanations for Anomalies Detected from Multivariate Temporal Data
|
||||||
Explaining why a given event is anomalous can be as important as detecting it, as it helps to create Trust. This Paper suggests a Method that is based on differentiating between features that contribute more and less.
|
Explaining why a given event is anomalous can be as important as detecting it, as it helps to create Trust. This Paper suggests a Method that is based on differentiating between features that contribute more and less.
|
||||||
It is also quite a short paper, so it is especially important to look for other papers.
|
It is also quite a short paper, so it is especially important to look for other papers.
|
||||||
https://dl.acm.org/doi/abs/10.1145/3357384.3358121
|
https://dl.acm.org/doi/abs/10.1145/3357384.3358121
|
||||||
Requires VPN; contact me if you have problems with this.
|
Requires VPN; contact me if you have problems with this.
|
||||||
|
|
||||||
Anomaly Detection for Monitoring
|
12) Interpretable AD for Device Failure
|
||||||
This is more of a book than a Paper, so it should be perfect for you if you do not have that much experience. It focuses on Time Series Analysis, namely the Task of detecting when a continuous datastream becomes anomalous. This is, for example, useful for a machine supervised by sensors that at some point stops working (and thus changes the sensor output).
|
This is an Application Paper. Its complexity comes mostly from the fact that real world data is messy and the Paper addresses ways to mitigate this.
|
||||||
https://assets.dynatrace.com/content/dam/en/wp/Anomaly-Detection-for-Monitoring-Ruxit.pdf
|
https://arxiv.org/pdf/2007.10088
|
||||||
|
|
||||||
Fast Unsupervised Anomaly Detection in Traffic Videos
|
13) Fast Unsupervised Anomaly Detection in Traffic Videos
|
||||||
This is another Application Paper. Its main complexity is the Input data type, as this uses videos (which are very high dimensional and contain temporal correlations). You will see how good preprocessing can make even a basic algorithm viable for complicated problems.
|
This is another Application Paper. Its main complexity is the Input data type, as this uses videos (which are very high dimensional and contain temporal correlations). You will see how good preprocessing can make even a basic algorithm viable for complicated problems.
|
||||||
https://openaccess.thecvf.com/content_CVPRW_2020/papers/w35/Doshi_Fast_Unsupervised_Anomaly_Detection_in_Traffic_Videos_CVPRW_2020_paper.pdf
|
https://openaccess.thecvf.com/content_CVPRW_2020/papers/w35/Doshi_Fast_Unsupervised_Anomaly_Detection_in_Traffic_Videos_CVPRW_2020_paper.pdf
|
||||||
|
|
||||||
HiCS: High Contrast Subspaces for Density-Based Outlier Ranking
|
|
||||||
As most datapoints are quite high-dimensional, it is often the case that some features are useless and could actually help hide the true anomalies. This Paper suggests a method to select a subspace that filters out unimportant features.
|
|
||||||
This paper was co-written by Prof. Müller and might be related to a future Master's thesis.
|
|
||||||
https://www.ipd.kit.edu/~muellere/publications/ICDE2012.pdf
|
|
||||||
|
|
||||||
LOF: identifying Density-Based Local Outliers
|
14) Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding
|
||||||
LOF is a classical algorithm used in many Applications. This is the original Paper introducing it. As this is a fairly old Paper, you will also find a lot of other sources describing LOF.
|
|
||||||
https://www.dbs.ifi.lmu.de/Publikationen/Papers/LOF.pdf
|
|
||||||
|
|
||||||
A comprehensive survey of anomaly detection techniques for high dimensional big data
|
|
||||||
Anomaly Detection is generally more complicated when you are given higher dimensional data (Curse of dimensionality). This seems a little weird, as machine learning usually improves when you are given more information. I imagine it as useless features confusing the algorithm. This Paper could be seen as a study of this phenomenon.
|
|
||||||
https://journalofbigdata.springeropen.com/track/pdf/10.1186/s40537-020-00320-x.pdf
|
|
||||||
|
|
||||||
A Survey on GANs for Anomaly Detection
|
|
||||||
GANs are an advanced ML method, normally used to generate really realistic artificial Images (check out https://thispersondoesnotexist.com/ if you have never done so). But these can also be used for anomaly detection. Your task would be to explain how.
|
|
||||||
https://arxiv.org/pdf/1906.11632
|
|
||||||
|
|
||||||
Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding
|
|
||||||
This is another Application Paper, but this time using a more complicated algorithm from recurrent ML. It tries to monitor the ever-growing number of spacecraft for anomalous behaviour.
|
This is another Application Paper, but this time using a more complicated algorithm from recurrent ML. It tries to monitor the ever-growing number of spacecraft for anomalous behaviour.
|
||||||
https://arxiv.org/pdf/1802.04431
|
https://arxiv.org/pdf/1802.04431
|
||||||
|
|
||||||
|
|
||||||
|
|