Summer DASSL 2017

Summer DASSL is a special session of DASSL with the specific goals of promoting scientific research and scholarship among undergraduate students and introducing them to modern tools and processes in software and data engineering.

During the special session, students are introduced to a “research process” and are given opportunities to research, discuss, write, present, and solve problems related to data science and data-intensive systems.

All activities are carried out in the context of real-life applications so students learn and apply practical skills that can set them apart from other CS graduates.

Summer DASSL 2017 was held from June 5, 2017 to July 14, 2017. It focused mainly on the following topics in the context of a real-life application called Gradebook:

The following people participated in Summer DASSL 2017:

The following table lists the topics discussed during Summer DASSL 2017 (excluding the implementation activities students carried out):

Date | Topic | Leader(s)
6-05 | “Data Science” vs “data science” | Murthy
6-05 | Big data and its characteristics: volume, variety, velocity | Murthy
6-05 | Mining vs querying; principal component analysis; classification vs clustering | Murthy
6-05 | Harvard CS109 Intro slides; becomingadatascientist.com | Murthy
6-06 | Kinds of analytics: descriptive, predictive, prescriptive | Murthy
6-06 | Introduction to Gradebook: concept, context | Murthy
6-07 | Bluemix: signup, explore | Team
6-07 | Data-processing considerations: store, process, present, transfer | Murthy
6-09 | Git repositories, Bitbucket | Rollo
6-09 | pgAdmin | Figueroa
6-12 | Outlier detection: Mahalanobis distance, masking, swamping | Griffin, Herger
6-13 | Data pivots and pivot queries | Murthy
6-14 | Markdown to markup: an introduction to Markdown | Schloss
6-14 | Markdown to HTML: relationship to automata and language theory | Murthy
6-14 | CSV-Pivot queries | Murthy
6-16 | Introduction to SQL query optimization | Murthy
6-16 | Losing data to optimize storage: Gradebook attendance information | Murthy
6-19 | Outlier detection; demo | Griffin, Herger
6-19 | K-Nearest Neighbor (KNN) for outlier detection | Murthy
6-19 | Licensing: options and obligations | Bhujwala
6-19 | Copyrights, trademarks, attribution | Murthy
6-20 | Data Analytics concerns: efficiency, security, price, expression, performance, ease of dev & maintenance | Murthy
6-20 | Intro to native analytics | Murthy
6-20 | SQL-native KNN | Murthy
6-20 | ETL: Extract, Transform, Load | Murthy
6-20 | Importing and exporting CSV data; COPY FROM and COPY TO in Postgres | Team
6-20 | Introduction to “Issues” in GitHub | Murthy
6-21 | Rosters: anonymizing and humanizing | Murthy
6-21 | Tutorial: Using Git Effectively | Figueroa, Rollo
6-21 | Tutorial: Using Git Effectively | Figueroa, Rollo
6-21 | Importing OpenClose schedule to Gradebook | Team
6-22 | What it takes to run a lab like DASSL | Team
6-26 | GitHub Desktop, try.github.io | Boylan
6-26 | Demo: GitHub Desktop | Boylan
6-26 | Standardizing DevOps tool chain in an organization | Murthy
6-26 | Filling in missing data: alternatives to listing class meeting dates | Figueroa, Rollo
6-26 | Customer delight as motivation to produce good software | Murthy
6-28 | Satisfying vs satisficing | Murthy
6-28 | SchoolTool vs Gradebook | Team
6-28 | Introduction to multi-tenancy | Murthy
6-28 | Data Science boot camps based in NYC | Team
6-29 | Function characteristics: idempotency, repeatability, side effects | Murthy
6-29 | Importing student rosters | Bella, Figueroa
6-29 | Multi-user operations | Murthy
6-30 | Using JDBC to retrieve data; CBOD and KNN in Java with DB data | Griffin, Herger
6-30 | Issues with API design; Java class loaded | Murthy
6-30 | Writing maintainable code | Murthy
6-30 | RETURNING in Postgres vs OUTPUT in MSSQL | Murthy
6-30 | Gists on GitHub | Murthy
7-03 | Social Network Analysis | Schloss
7-03 | Elements of successful presentations; kinds of examples: simple, comprehensive, counter | Murthy
7-03 | Scales: ordinal, nominal, interval, ratio | Murthy
7-03 | Managing merge conflicts in Git | Rollo
7-03 | Overview of ClassDB: focus on application roles | Figueroa
7-05 | Representing seasons in Gradebook on a scale: detecting out-of-sequence imports | Murthy, Rollo
7-05 | Anonymizing and humanizing data; adding salt to data | Murthy
7-05 | Type inference in programming languages; auto in C++ | Murthy
7-06 | Data models: HTML (not exactly a “data model”), XML, JSON | Murthy
7-06 | Using XML and/or JSON in Gradebook: REST APIs | Murthy
7-10 | SchoolTool | Boylan
7-10 | JSON | Griffin
7-10 | Web API frameworks: REST, SOAP, JSON, Node.js | Bhujwala
7-10 | High-level architecture of Gradebook: session mgmt., connection pooling | Murthy
7-10 | Building online portfolios | Murthy
7-11 | Native-SQL implementation of KNN outlier detection | Schloss
7-11 | Introduction to R | Herger
7-11 | Domain-specific languages | Murthy
7-11 | Library vs Language | Murthy
7-12 | Union compatibility | Murthy
7-12 | User-defined functions in Postgres | Team
7-13 | YAML, JSON, XML: language homomorphism and isomorphism | Murthy
7-13 | Humanizing student data | Bhujwala
7-13 | Issues in web apps: where to do what | Murthy
7-13 | Offloading work from DBMS to web server to client | Murthy
7-13 | REST API for Gradebook | Team
7-13 | DBMS functions to back Gradebook REST API | Bhujwala, Griffin
7-13 | Material UI for Gradebook web client | Figueroa
7-13 | Gradebook web server | Rollo