Big Data Moscow 2018

 

Theofilos Kakantousis

Logical Clocks AB, Sweden

BIO

Theofilos Kakantousis is the COO and co-founder of Logical Clocks AB, the main developers of Hops Hadoop. He received his MSc in Distributed Systems from KTH in 2015. He previously worked as a middleware consultant at Oracle, Greece, and as a research engineer at SAP AG, Zurich and at RISE SICS AB, Stockholm. He frequently gives talks on Hops Hadoop and has presented Hops at venues such as Strata San Jose/New York, Big Data Tech Warsaw and BigDataConference Vilnius.

TOPIC

Multi-tenant Deep Learning and Streaming as-a-Service with Hopsworks

Hops is a new European version of Apache Hadoop that introduces the concepts of projects, datasets and users to Hadoop in order to provide multi-tenant Deep Learning-as-a-Service and Streaming-as-a-Service. Our platform for managing datasets and running jobs, called Hopsworks, builds on these Hops concepts and is an entirely UI-driven environment implemented with only open-source software. In this talk we will discuss the challenges and experiences of building a secure platform that runs both machine learning and streaming applications using a wide range of technologies.

We will show how Hopsworks provides Distributed Deep Learning-as-a-Service with TensorFlow, Uber’s Horovod and Yahoo’s TensorFlowOnSpark, and demonstrate how data scientists can perform large-scale hyperparameter optimization, monitor model training using TensorBoard and manage their experiments using the Hopsworks Experiment Service. We will demonstrate how we run streaming applications on both Spark and Flink with Kafka over YARN, and how we run SQL on big data with Hive and SparkSQL. We will also show how we use the ELK stack (Elasticsearch, Logstash and Kibana) and SparkUI for logging and debugging Spark applications, how we use Grafana to monitor Spark applications, and how Jupyter notebooks provide interactive visualizations and charts to end users.

Moreover, we will show how Hopsworks simplifies discovering and downloading huge datasets using Dela, a custom peer-to-peer sharing tool. Within minutes, users can install the platform, discover important curated datasets and download them, either to apply their business logic with a streaming application or to train deep neural networks using TensorFlow. Finally, we will discuss our experiences running Deep Learning-as-a-Service and Streaming-as-a-Service on a cluster in Sweden with over 400 users (as of mid-2018).

Date: October 11, 2018