1st International Workshop on Data Analytics and Machine Learning Made Simple

Co-located with EDBT 2021, Nicosia, Cyprus (Online), March 23, 2021

Call for Papers


Many current applications, despite their widely different characteristics, generate and need to process massive amounts of static or streaming data. For example, Data Lakes gather large amounts of diverse data from a multitude of data sources so that data analysts can perform ad hoc, self-service analytics and train machine learning models, reducing the time from data to insights. These operations are particularly challenging for applications that process streaming Big Data. Achieving this goal requires addressing various challenges relating to data volume, velocity, dynamicity, heterogeneity, and potentially (geo-)distributed data processing.

Although a plethora of techniques, algorithms and tools exists to manage, query and analyze various types of data, they typically require a high degree of data management skill and expertise, as well as significant time and effort for data preparation, parameter tuning, and the design and implementation of data analytics and machine learning pipelines.

The aim of the SIMPLIFY workshop is to bring together computer scientists with interests in this field to present recent innovations, find topics of common interest, and stimulate further development of new approaches that greatly simplify the work of a data analyst when performing data analytics, or when employing machine learning algorithms, over Big Data.


Topics of interest include (but are not limited to):
      Novel architectures for data analytics and ML over data lakes
      Novel architectures for data analytics and online ML over streaming data
      Query processing over heterogeneous data
      Query processing over geo-distributed data
      Query optimization of data processing workflows
      Algorithms for mining and analytics over heterogeneous data
      Algorithms for online machine learning and data mining
      Similarity search and entity resolution
      Interactive data exploration
      Visual analytics over heterogeneous data
      Deep learning platforms
      Application papers demonstrating the impact of techniques relevant to SIMPLIFY

Submission guidelines

We invite submissions of novel research, completed or in-progress work, vision, and system papers. The page limit for regular research papers is 6 pages. Additionally, we welcome submission of short papers, up to 4 pages, of the following types: (a) papers that describe ongoing work that has not yet reached the maturity required for a full research paper; (b) vision papers that describe a vision for the future of the field; (c) system/application papers and demos.

Papers must present original work and must not have been submitted to, or accepted for publication in, any other workshop, conference or journal.

Submitted papers must follow the ACM Proceedings Format (adapted template for EDBT 2021 can be found here) and should be submitted electronically as PDF documents using the online EasyChair submission system.
All workshop papers will be indexed by DBLP and will be published online at CEUR.

Important Dates

      Submission deadline: December 29, 2020 (extended from December 22, 2020)
      Notification to authors: January 25, 2021 (moved from January 22, 2021)
      Camera-ready deadline: February 8, 2021 (moved from February 1, 2021)


Workshop Chairs

      Antonios Deligiannakis, Technical University of Crete
      Manolis Koubarakis, National and Kapodistrian University of Athens
      Dimitris Skoutas, Athena Research Center

Program Committee

      Alexander Artikis, NCSR "Demokritos"
      Konstantina Bereta, National and Kapodistrian University of Athens
      Daniele Bonetta, Oracle Labs
      Bikash Chandra, Ecole Polytechnique Fédérale de Lausanne
      Nikos Giatrakos, Athena Research Center
      Damien Graux, ADAPT Centre and Trinity College Dublin
      Asterios Katsifodimos, Delft University of Technology
      Georgia Koutrika, Athena Research Center
      Matteo Lissandrini, Aalborg University
      Davide Mottin, Aarhus University
      Ioannis Mytilinis, Ecole Polytechnique Fédérale de Lausanne
      Eirini Ntoutsi, L3S Research Center
      Odysseas Papapetrou, Eindhoven University of Technology
      Matthias Renz, Christian-Albrechts-Universität zu Kiel
      Dimitris Sacharidis, Vienna University of Technology
      Alkis Simitsis, Athena Research Center
      Giovanni Simonini, Università di Modena e Reggio Emilia
      Thanasis Vergoulis, Athena Research Center
      Nikolay Yakovets, Eindhoven University of Technology


Program

All times are in GMT+1 (Central Europe).

Session 1
09:00-09:05 Welcome
09:05-09:45 Keynote
Session 2
10:00-10:15 Scale-independent Data Analysis with Database-backed Dataframes: a Case Study
Phanwadee Sinthong, Michael Carey and Yuhan Yao
10:15-10:30 What's Mine is Yours, What's Yours is Mine: Simplifying Significance Testing With Big Data
Karan Matnani, Valerie Liptak and George Forman
10:30-10:45 Simplifying p-value Calculation for the Unbiased microRNA Enrichment Analysis, Using ML-techniques
Konstantinos Zagganas, Maria Lioli, Thanasis Vergoulis and Theodore Dalamagas
10:45-11:00 Storage Management in Smart Data Lake
Haoqiong Bian, Bikash Chandra, Ioannis Mytilinis and Anastasia Ailamaki
11:00-11:15 Easy Spark
Ylaise van den Wildenberg, Wouter W.L. Nuijten and Odysseas Papapetrou
11:15-11:30 MRbox: Simplifying Working with Remote Heterogeneous Analytics and Storage Services via Localised Views
Athina Kyriakou and Iraklis Angelos Klampanos
Session 3
11:45-12:00 Multi-Attribute Similarity Search for Interactive Data Exploration
Kostas Patroumpas, Alexandros Zeakis, Dimitrios Skoutas and Roberto Santoro
12:00-12:15 Speculative Execution of Similarity Queries: Real-Time Parameter Optimization through Visual Exploration
Thilo Spinner, Udo Schlegel, Martin Schall, Fabian Sperrle, Rita Sevastjanova, Beatrice Gobbo, Julius Rauscher, Mennatallah El-Assady and Daniel A. Keim
12:15-12:30 An Empirical Evaluation of Early Time-Series Classification Algorithms
Evgenios Kladis, Charilaos Akasiadis, Evangelos Michelioudakis, Elias Alevizos and Alexandros Artikis
12:30-12:45 Weighted Load Balancing Mechanisms over Streaming Big Data for Online Machine Learning
Petros Petrou, Sophia Karagiorgou and Dimitrios Alexandrou
12:45-13:00 Simplifying Impact Prediction for Scientific Articles
Thanasis Vergoulis, Ilias Kanellos, Giorgos Giannopoulos and Theodore Dalamagas