Jobs

We are looking for motivated graduates and undergraduates to join our lab. The job descriptions are below:

Position 1

The DICE lab seeks research assistants in the area of systems and machine learning. The graduate assistant will contribute to projects that examine the application of ML to systems problems of auditing, path tracking, and program differencing. Students will get experience working with graph neural networks, systems tracing methods, and graph analysis. Interested students are welcome to contact Prof. Tanu Malik (tanu dot malik at depaul dot edu).

Position 2

The DICE lab at the School of Computing, DePaul University, seeks research associates working at the interface of cloud computing and data provenance. The graduate assistant will contribute to projects that examine the efficiency of data containerization using data provenance. The associate will work with scientific workflow systems such as Pegasus and HTCondor and use novel containerization methods (Sciunit) to examine issues such as resource allocation and performance reproducibility. The project is funded by the National Aeronautics and Space Administration and the National Science Foundation. We expect the associate to contribute to software development, research, and publications. If interested, please contact tanu dot malik at depaul dot edu.

Here is a sample of ongoing projects that are looking for graduate assistants:

Title: Auditing Provenance Across Containers

Short Description: The provenance of a piece of data describes how that data was obtained. System provenance refers to the sequence of system calls that generate or derive a file in a Linux system. Current methods generate system provenance within a non-isolated context, i.e., they do not account for containerized processes. This practicum will explore auditing and management of provenance in isolated contexts. The objective will be to experiment with different provenance auditing methods, viz., the Linux ptrace utility and Linux on different system calls that establish isolation contexts. These experiments will help us design an auditing framework across containers. We will develop different methods to store provenance, i.e., within containers or in a central location. The practicum will provide hands-on experience with the basics of Docker containers and several Linux utilities.

Expected Deliverables: The expected deliverable will be a document describing different isolation contexts and a design document describing how to audit provenance in those contexts.

Related work: https://www.linuxjournal.com/article/6100 https://lwn.net/Articles/531114/

Prerequisites: A sound knowledge of operating systems and file systems.

Title: Bandit-Based Algorithms for ML Workflow Selection

Short Description: Machine Learning (ML) has been successfully applied to a wide range of domains and applications. As the number of ML applications grows, there is a need for tools that boost the data scientist’s productivity. One of the time-consuming tasks in ML is selecting, combining, and configuring algorithms for the task at hand, often known as the workflow selection problem. Currently, workflow selection is often done manually. In this practicum, we will explore methods that automate it. In particular, we will study how multi-armed bandit-based algorithms apply to workflow selection. We will then apply such an algorithm within a data-intensive workflow selection framework.
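To make the idea concrete, here is a minimal sketch of how a bandit algorithm could treat each candidate workflow as an arm and its validation score as the reward. It uses UCB1, one common multi-armed bandit strategy, chosen here only for illustration; the practicum may use a different algorithm. The workflow names, mean accuracies, and the helper functions make_workflow, ucb1_select, and run_bandit are all hypothetical, and the evaluations are simulated rather than produced by training real models.

```python
import math
import random


def ucb1_select(counts, totals, t):
    """Pick the next arm under UCB1: mean reward plus an exploration bonus."""
    # Try every arm once before relying on the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        totals[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(reward_fns, rounds, seed=0):
    """Run UCB1 for a fixed number of rounds; return pull counts and reward sums."""
    rng = random.Random(seed)
    counts = [0] * len(reward_fns)
    totals = [0.0] * len(reward_fns)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, totals, t)
        reward = reward_fns[arm](rng)  # rewards are assumed to lie in [0, 1]
        counts[arm] += 1
        totals[arm] += reward
    return counts, totals


def make_workflow(mean_accuracy, noise=0.05):
    """Hypothetical stand-in for evaluating a workflow: noisy accuracy in [0, 1]."""
    def evaluate(rng):
        return min(1.0, max(0.0, rng.gauss(mean_accuracy, noise)))
    return evaluate


# Assumed candidate workflows with made-up mean validation accuracies.
workflows = {
    "scaler+logreg": make_workflow(0.60),
    "pca+svm": make_workflow(0.70),
    "raw+gbt": make_workflow(0.90),
}
names = list(workflows)
counts, totals = run_bandit([workflows[n] for n in names], rounds=500)

# The most frequently pulled arm is the bandit's preferred workflow.
best = names[max(range(len(names)), key=counts.__getitem__)]
print(dict(zip(names, counts)), "preferred:", best)
```

In a real setting, each call to evaluate would train and score a workflow on data, so the bandit's value lies in spending most of that evaluation budget on the promising candidates.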

Expected Deliverables: A description of the problem statement, background on multi-armed bandit-based algorithms, and an implementation of such an algorithm in the data-intensive workflow selection framework.

Related work: https://lilianweng.github.io/posts/2018-01-23-multi-armed-bandit/ https://www.tensorflow.org/agents/tutorials/intro_bandit

Prerequisites: Proficiency in Python and data science workflows; interest in advanced algorithmic thinking and solving optimization problems.