mlsem/summary.txt

1) Anomaly Detection for Monitoring
This is more of a book than a paper, so it should be perfect for you if you do not have much experience yet. It focuses on time series analysis, namely the task of detecting when a continuous data stream becomes anomalous. This is useful, for example, for a machine supervised by sensors that at some point stops working (and thus changes the sensor output).
https://assets.dynatrace.com/content/dam/en/wp/Anomaly-Detection-for-Monitoring-Ruxit.pdf
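To get a first feeling for the task, here is a minimal sketch of stream monitoring with a rolling z-score (window size and threshold are my own illustrative choices, not taken from the book):

from collections import deque
import math
import random

def rolling_zscore_alerts(stream, window=100, threshold=4.0):
    """Yield (index, value, z) for points that deviate strongly from the recent window."""
    buf = deque(maxlen=window)
    for i, x in enumerate(stream):
        if len(buf) == window:
            mean = sum(buf) / window
            var = sum((v - mean) ** 2 for v in buf) / window
            std = math.sqrt(var) or 1e-9
            z = (x - mean) / std
            if abs(z) > threshold:
                yield i, x, z
        buf.append(x)

# Example: a sensor whose output shifts upward after step 500.
readings = [random.gauss(0, 1) for _ in range(500)] + [random.gauss(6, 1) for _ in range(100)]
alerts = list(rolling_zscore_alerts(readings))
print("alerts at steps:", [i for i, _, _ in alerts][:5])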
2) A comprehensive survey of anomaly detection techniques for high dimensional big data
Anomaly detection generally becomes more complicated with higher-dimensional data (the curse of dimensionality). This may seem a little odd, as machine learning usually improves when you are given more information. I imagine it as useless features confusing the algorithm. This paper can be seen as a study of this phenomenon.
https://journalofbigdata.springeropen.com/track/pdf/10.1186/s40537-020-00320-x.pdf
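To see the effect yourself, here is a small illustration (my own sketch, not from the survey): an outlier that is obvious in two informative dimensions becomes harder for a distance-based detector to spot once irrelevant noise features are added.

import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(500, 2))
outlier = np.array([[8.0, 8.0]])                # clearly anomalous in the 2 informative dims

for n_noise in [0, 10, 100]:
    X = np.vstack([normal, outlier])
    noise = rng.normal(0, 1, size=(X.shape[0], n_noise))
    Xn = np.hstack([X, noise])
    lof = LocalOutlierFactor(n_neighbors=20)
    lof.fit(Xn)
    score = -lof.negative_outlier_factor_[-1]   # LOF score of the planted outlier
    print(f"{n_noise:3d} noise dims -> LOF score of planted outlier: {score:.2f}")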
3) A Comprehensive Survey on Graph Anomaly Detection with Deep Learning
A lot of datasets that are interesting for AD (for example email communications or trading data) are best represented as graphs. This poses unique challenges for AD algorithms.
This paper could either be handled by two students or split in two: maybe one considers anomalous graphs, while the other considers anomalous nodes within graphs.
https://arxiv.org/pdf/2106.07178
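To make the setting concrete, here is a rough non-deep baseline (my own sketch, not one of the survey's methods): describe each node by a few structural features and run an ordinary outlier detector on them.

import networkx as nx
import numpy as np
from sklearn.ensemble import IsolationForest

G = nx.barabasi_albert_graph(200, 3, seed=1)
G.add_edges_from((0, v) for v in range(1, 150))   # turn node 0 into an unusually dense hub
clust = nx.clustering(G)
nbr_deg = nx.average_neighbor_degree(G)
feats = np.array([[G.degree(n), clust[n], nbr_deg[n]] for n in G.nodes])
scores = IsolationForest(random_state=0).fit(feats).decision_function(feats)
print("most anomalous node:", int(np.argsort(scores)[0]))   # lowest score = most anomalous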
4) LOF: Identifying Density-Based Local Outliers
LOF is a classical algorithm used in many applications. This is the original paper introducing it. As it is a fairly old paper, you will also find plenty of other sources describing LOF.
https://www.dbs.ifi.lmu.de/Publikationen/Papers/LOF.pdf
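LOF is also available off the shelf, so trying it out is easy; a minimal sketch with scikit-learn (parameters are illustrative, the paper itself of course predates this implementation):

import numpy as np
from sklearn.neighbors import LocalOutlierFactor

X = np.vstack([np.random.RandomState(0).normal(0, 1, (300, 2)), [[6.0, 6.0]]])
lof = LocalOutlierFactor(n_neighbors=20)          # k: size of the local neighbourhood
labels = lof.fit_predict(X)                       # -1 marks points flagged as outliers
scores = -lof.negative_outlier_factor_            # the actual LOF values (around 1 for inliers)
print("flagged points:", np.where(labels == -1)[0], "max LOF:", round(float(scores.max()), 2))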
5) HiCS: High Contrast Subspaces for Density-Based Outlier Ranking
As most data points are quite high-dimensional, it is often the case that some features are useless and can even help to hide the true anomalies. This paper suggests a method to select subspaces that filter out unimportant features.
This paper was co-written by Prof. Müller and might be related to a future Master's thesis.
https://www.ipd.kit.edu/~muellere/publications/ICDE2012.pdf
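To make the idea tangible, here is a heavily simplified sketch of the contrast notion behind HiCS (my own simplification, not the paper's exact Monte-Carlo procedure): a subspace has high contrast if conditioning on a random slice of some of its attributes noticeably changes the distribution of another one.

import numpy as np
from scipy.stats import ks_2samp

def subspace_contrast(X, dims, n_draws=50, slice_frac=0.3, seed=0):
    """Higher value = the attributes in `dims` constrain each other more strongly."""
    rng = np.random.default_rng(seed)
    dims = list(dims)
    stats = []
    for _ in range(n_draws):
        target = int(rng.choice(dims))                 # attribute whose distribution we compare
        cond = [d for d in dims if d != target]
        mask = np.ones(len(X), dtype=bool)
        for d in cond:                                 # random slice on the remaining attributes
            lo = rng.uniform(0, 1 - slice_frac)
            qlo, qhi = np.quantile(X[:, d], [lo, lo + slice_frac])
            mask &= (X[:, d] >= qlo) & (X[:, d] <= qhi)
        if mask.sum() > 10:
            stats.append(ks_2samp(X[:, target], X[mask, target]).statistic)
    return float(np.mean(stats)) if stats else 0.0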
6) Neural Transformation Learning for Deep Anomaly Detection Beyond Images
While for image data certain transformations (like rotations) can clearly improve machine learning tasks such as anomaly detection, such transformations are much less well defined for time-series/tabular data. This paper tries to solve this by making the transformations learnable.
https://arxiv.org/pdf/2103.16440
7) A Survey on GANs for Anomaly Detection
GANs are an advanced ML method, normally used to generate strikingly realistic artificial images (check out https://thispersondoesnotexist.com/ if you have never done so). But they can also be used for anomaly detection. Your task would be to explain how.
https://arxiv.org/pdf/1906.11632
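To give an idea of the "how": one family of methods covered in the survey (AnoGAN and its successors) scores a point by how well an already trained GAN can reproduce it. A hedged sketch, assuming a trained generator G, discriminator D and the latent dimension are already available:

import torch

def gan_anomaly_score(x, G, D, latent_dim, steps=200, lam=0.1):
    """Search for a latent code whose generated sample matches x; poor matches score high."""
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=0.01)
    for _ in range(steps):
        opt.zero_grad()
        recon = G(z)
        residual = (x - recon).abs().sum()     # can the generator reproduce x at all?
        disc = (D(x) - D(recon)).abs().sum()   # discriminator mismatch (the papers use intermediate features)
        loss = (1 - lam) * residual + lam * disc
        loss.backward()
        opt.step()
    return loss.item()                          # high score = likely anomalous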
8) Unsupervised Anomaly Detection Ensembles using Item Response Theory
Different AD algorithms are usually better at finding different types of anomalies. To get a more general algorithm you can combine several of them into an ensemble.
This paper could be merged with "Active AD via Ensembles" and handled by two students.
https://arxiv.org/pdf/2106.06243
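A minimal sketch of the basic combination idea (plain z-normalisation and averaging, which is a common baseline and not the item-response-theory approach of the paper itself):

import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

def ensemble_scores(X):
    raw = [
        -IsolationForest(random_state=0).fit(X).decision_function(X),
        -LocalOutlierFactor(n_neighbors=20).fit(X).negative_outlier_factor_,
        -OneClassSVM(nu=0.1).fit(X).decision_function(X),
    ]
    # normalise each detector's scores before averaging so no single scale dominates
    norm = [(s - s.mean()) / (s.std() + 1e-9) for s in raw]
    return np.mean(norm, axis=0)               # higher = more anomalous under the ensemble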
9) Active AD via Ensembles...
This paper tries to do a lot. I suggest that you focus on the active learning part. Alternatively, we also have a paper on ensembles, so if you both want, you can combine these papers to be worked on by two students. Active AD extends the task of finding anomalies to the case in which the anomaly status of the training events is not clearly defined. The focus here lies in minimizing the amount of human work needed to label a given dataset (given some labels, train a model, find the new events the model is unsure about, label those, restart).
I want to note here that great work on an easy topic is, for us, the same as good work on a hard topic.
https://arxiv.org/pdf/1901.08930
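The query loop in the parentheses above might look like this in code (an illustrative uncertainty-sampling variant with a hypothetical ask_expert function, not the paper's exact ensemble-based procedure):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def active_loop(X, ask_expert, n_rounds=5, batch=10, seed=0):
    """ask_expert(indices) -> list of 0/1 labels; it stands in for the human analyst."""
    rng = np.random.default_rng(seed)
    y = {}                                                  # index -> label obtained so far
    start = [int(i) for i in rng.choice(len(X), size=batch, replace=False)]
    y.update(zip(start, ask_expert(start)))
    for _ in range(n_rounds):
        idx = sorted(y)
        clf = RandomForestClassifier(random_state=seed).fit(X[idx], [y[i] for i in idx])
        proba = clf.predict_proba(X)
        uncertainty = 1.0 - proba.max(axis=1)               # 0 = confident, 0.5 = completely unsure
        ranked = np.argsort(-uncertainty)                   # ask about the most uncertain events first
        query = [int(i) for i in ranked if int(i) not in y][:batch]
        y.update(zip(query, ask_expert(query)))
    return clf, y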
10) Contextual Outliers
This paper focuses on the interpretability of anomaly detection methods. The method described works by splitting the set of normal events into groups and relating any abnormal event to its surrounding normal ones. I would say it is on the practical side, and I want to strongly encourage you to implement this algorithm if you choose this topic.
https://arxiv.org/abs/1711.10589
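If you do implement something, the rough shape of the idea could look like this (my own simplified sketch: cluster the normal events, then describe an outlier by how it deviates from its closest group; the paper's actual method is more refined):

import numpy as np
from sklearn.cluster import KMeans

def explain_outlier(x, X_normal, n_groups=5):
    km = KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit(X_normal)
    nearest = km.predict(x.reshape(1, -1))[0]             # context group the outlier is closest to
    center = km.cluster_centers_[nearest]
    spread = X_normal[km.labels_ == nearest].std(axis=0) + 1e-9
    deviation = (x - center) / spread                     # per-feature deviation from that context
    top = np.argsort(-np.abs(deviation))[:3]              # features that deviate the most
    return nearest, {int(f): float(deviation[f]) for f in top}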
11) Additive Explanations for Anomalies Detected from Multivariate Temporal Data
Explaining why a given event is anomalous can be as important as detecting it, as it helps to create trust. This paper suggests a method based on differentiating between features that contribute more and those that contribute less.
It is also quite a short paper, so it is extra important to look for additional sources.
https://dl.acm.org/doi/abs/10.1145/3357384.3358121
Accessing it requires a VPN; contact me if you have problems with this.
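To illustrate the flavour of per-feature attributions, here is a simple occlusion-style sketch (my own stand-in with an assumed score_fn interface, not the method from the paper): replace one feature at a time with a typical normal value and see how much the anomaly score drops.

import numpy as np

def feature_contributions(x, score_fn, X_normal):
    """score_fn(sample) -> anomaly score; higher = more anomalous (assumed interface)."""
    baseline = np.median(X_normal, axis=0)
    full = score_fn(x)
    contrib = np.zeros_like(x, dtype=float)
    for j in range(len(x)):
        x_masked = x.copy()
        x_masked[j] = baseline[j]                # pretend feature j looked normal
        contrib[j] = full - score_fn(x_masked)   # how much feature j adds to the score
    return contrib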
12) Interpretable AD for Device Failure
This is an application paper. Its complexity comes mostly from the fact that real-world data is messy, and the paper addresses ways to mitigate this.
https://arxiv.org/pdf/2007.10088
13) Fast Unsupervised Anomaly Detection in Traffic Videos
This is another application paper. Its main complexity lies in the input data type, as it uses videos (which are very high-dimensional and contain temporal correlations). You will see how good preprocessing can make even a basic algorithm viable for complicated problems.
https://openaccess.thecvf.com/content_CVPRW_2020/papers/w35/Doshi_Fast_Unsupervised_Anomaly_Detection_in_Traffic_Videos_CVPRW_2020_paper.pdf
14) Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding
This is another application paper, but this time using more complicated algorithms from recurrent ML. It tries to monitor the ever-growing number of spacecraft for anomalous behaviour.
https://arxiv.org/pdf/1802.04431
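As a rough idea of the thresholding step, here is a heavily simplified sketch (it assumes you already have a model's one-step-ahead predictions; the paper's nonparametric dynamic threshold selection is more elaborate):

import numpy as np
import pandas as pd

def flag_anomalies(y_true, y_pred, span=30, z=4.0):
    err = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    smoothed = pd.Series(err).ewm(span=span).mean()       # smooth out one-off error spikes
    threshold = smoothed.mean() + z * smoothed.std()      # simple stand-in for the paper's dynamic threshold
    return np.where(smoothed > threshold)[0]              # indices of anomalous timesteps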