# Data Science

*All courses, faculty listings, and curricular and degree requirements described herein are subject to change or deletion without notice.*

## Courses

*For course descriptions not found in the* UC San Diego General Catalog 2020–21, *please contact the department for more information.*

## Lower Division

DSC 10. Principles of Data Science (4)

This introductory course develops computational thinking and tools necessary to answer questions that arise from large-scale datasets. This course emphasizes an end-to-end approach to data science, introducing programming techniques in Python that cover data processing, modeling, and analysis. **Prerequisites:** none.

DSC 20. Programming and Basic Data Structures for Data Science (4)

Provides an understanding of the structures that underlie the programs, algorithms, and languages used in data science by expanding the repertoire of computational concepts introduced in DSC 10 and exposing students to techniques of abstraction. Course will be taught in Python and will cover topics including recursion, higher-order functions, function composition, object-oriented programming, interpreters, classes, and simple data structures such as arrays, lists, and linked lists. **Prerequisites:** DSC 10. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 30. Data Structures and Algorithms for Data Science (4)

Builds on topics covered in DSC 20 and provides practical experience in composing larger computational systems through several significant programming projects using Java. Students will study advanced programming techniques including encapsulation, abstract data types, interfaces, algorithms and complexity, and data structures such as stacks, queues, priority queues, heaps, linked lists, binary trees, binary search trees, and hash tables. **Prerequisites:** DSC 20. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 40A. Theoretical Foundations of Data Science I (4)

This course, the first of a two-course sequence (DSC 40A-B), will introduce the theoretical foundations of data science. Students will become familiar with mathematical language for expressing data analysis problems and solution strategies, and will receive training in probabilistic reasoning, mathematical modeling of data, and algorithmic problem solving. DSC 40A will introduce fundamental topics in machine learning, statistics, and linear algebra with applications to data analysis. DSC 40A-B connect to DSC 10, 20, and 30 by providing the theoretical foundation for the methods that underlie data science. **Prerequisites:** DSC 10, MATH 20C or MATH 31BH, and MATH 18 or MATH 20F or MATH 31AH. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 40B. Theoretical Foundations of Data Science II (4)

The sequence DSC 40A-B introduces the theoretical foundations of data science and covers the following topics: mathematical language for expressing data analysis problems and solution strategies, probabilistic reasoning, mathematical modeling of data, and algorithmic problem solving. DSC 40B, the second course in the sequence, introduces fundamental topics in combinatorics, graph theory, probability, and continuous and discrete algorithms with applications to data analysis. **Prerequisites:** DSC 20 and 40A. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 80. The Practice and Application of Data Science (4)

The marriage of data, computation, and inferential thinking, or “data science,” is redefining how people and organizations solve challenging problems and understand the world. This course bridges lower- and upper-division data science courses as well as methods courses in other fields. Students master the data science life-cycle and learn many of the fundamental principles and techniques of data science spanning algorithms, statistics, machine learning, visualization, and data systems. **Prerequisites:** DSC 30 and DSC 40A. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 90. Seminar in Data Science (2)

Students will learn about a variety of topics in data science through interactive presentations from faculty and industry professionals. **Prerequisites:** DSC 10. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 95. Tutor Apprenticeship in Data Science (2)

Students will receive training in skills and techniques necessary to be effective tutors for data science courses. Students will also gain practical experience in tutoring students on data science topics. **Prerequisites:** DSC 10. Students must have applied for and been accepted as a tutor for a DSC course for the first time; enrollment in DSC 95 is required for these students.

DSC 96. Workshop in Data Science (2)

Students will explore topics and tools relevant to the practice of data science in a workshop format. The instructor works with students on guided projects to help them acquire knowledge and skills that complement their course work in the core data science classes. Topics vary from quarter to quarter. Concurrent enrollment in either DSC 10 or DSC 20 is strongly recommended. **Prerequisites:** Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 97. Internship in Data Science (2 or 4)

Individual research on a topic related to data science, by special arrangement with and under the direction of a UC San Diego faculty member, in connection with an internship at an organization. Internship work will inform but not necessarily define the research topic. The research topic is expected to promote the study of the principles and techniques involved in the internship work. May be taken for credit up to two times. **Prerequisites:** MATH 20C or MATH 31BH and MATH 18 or MATH 31AH and DSC 20 and DSC 40A. Completion of thirty units at UC San Diego with a university GPA of 3.0. Special Studies Form, consent of the instructor, and approval of the department required. Priority enrollment is given to data science majors DS25. Restricted to first-year and sophomore level students.

DSC 98. Directed Group Study in Data Science (2 or 4)

Students will investigate a topic in data science through directed reading, discussion, and project work under the supervision of a faculty member. May be taken for credit up to two times. **Prerequisites:** MATH 20C or MATH 31BH and MATH 18 or MATH 31AH and DSC 20 and DSC 40A. Completion of thirty units at UC San Diego with a university GPA of 3.0. Special Studies Form, consent of the instructor, and approval of the department required. Priority enrollment is given to data science majors DS25. Restricted to first-year and sophomore level students.

DSC 99. Independent Study in Data Science (2 or 4)

Students will participate in independent study or research in data science under the direction of a UC San Diego faculty member. May be taken for credit up to two times. **Prerequisites:** MATH 20C or MATH 31BH and MATH 18 or MATH 31AH and DSC 20 and DSC 40A. Completion of thirty units at UC San Diego with a university GPA of 3.0. Special Studies Form, consent of the instructor, and approval of the department required. Priority enrollment is given to data science majors DS25. Restricted to first-year and sophomore level students.

## Upper Division

DSC 100. Introduction to Data Management (4)

This course is an introduction to storage and management of large-scale data using classical relational (SQL) systems, with an eye toward applications in data science. The course covers topics including the SQL data model and query language, relational data modeling and schema design, elements of cost-based query optimization, relational database architecture, and database-backed applications. **Prerequisites:** DSC 80 and DSC 40B. Restricted to students with upper-division standing. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 102. Systems for Scalable Analytics (4)

This course introduces the principles of computing systems and infrastructure for scaling analytics to large datasets. Topics include memory hierarchy, distributed systems, model selection, heterogeneous datasets, and deployment at scale. The course will also discuss the design of systems such as MapReduce/Hadoop and Spark, in conjunction with their implementation. Students will also learn how dataflow operations can be used to perform data preparation, cleaning, and feature engineering. **Prerequisites:** DSC 100. Restricted to students with upper-division standing. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 104. Beyond Relational Data Management (4)

The course will introduce a variety of NoSQL data formats, data models, high-level query languages, and programming abstractions representative of the needs of modern data analytic tasks. Topics include hierarchical graph database systems, unrestricted graph database systems, array databases, comparison of expressive power of the data models, and parallel programming abstractions, including MapReduce and its descendants. **Prerequisites:** DSC 100. Restricted to students with upper-division standing. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 106. Introduction to Data Visualization (4)

Data visualization helps explore and interpret data through interaction. This course introduces the principles, techniques, and algorithms for creating effective visualizations. The course draws on the knowledge from several disciplines including computer graphics, human-computer interaction, cognitive psychology, design, and statistical graphics and synthesizes relevant ideas. Students will design visualization systems using D3 or other web-based software and evaluate their effectiveness. It is highly recommended that students take MATH 189 prior to taking DSC 106. **Prerequisites:** DSC 80. Restricted to students with upper-division standing. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 120. Signal Processing for Data Analysis (4)

This course will focus on ideas from classical and modern signal processing, with the main themes of sampling continuous data and building informative representations using orthonormal bases, frames, and data-dependent operators. Topics include sampling theory, Fourier analysis, lossy transformations and compression, time and spatial filters, and random Fourier features and connections to kernel methods. Sources of data include time series and streaming signals and various imaging modalities. **Prerequisites:** MATH 18 or MATH 31AH and MATH 20C and DSC 40B. Restricted to students with upper-division standing. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 140A. Probabilistic Modeling and Machine Learning (4)

The course covers learning and using probabilistic models for knowledge representation and decision-making. Topics covered include graphical models, temporal models, and online learning, as well as applications to natural language processing, adversarial learning, computational biology, and robotics. Prior completion of MATH 181A is strongly recommended. **Prerequisites:** DSC 40B, DSC 80, and ECE 109 or ECON 120A or MAE 108 or MATH 180A or MATH 183 or MATH 186. Restricted to students with upper-division standing. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 155. Hidden Data in Random Matrices (4)

Rigorous treatment of principal component analysis, one of the most effective methods in finding signals amidst the noise of large data arrays. Topics include singular value decomposition for matrices, maximum likelihood estimation, least squares methods, unbiased estimators, random matrices, Wigner’s semicircle law, Marchenko-Pastur laws, universality of eigenvalue statistics, outliers, the BBP transition, applications to community detection, and the stochastic block model. **Prerequisites:** MATH 180A and (MATH 18 or MATH 31AH). Completion of MATH 102 is encouraged but not required. Students will not receive credit for both MATH 182 and DSC 155.

DSC 160. Data Science and the Arts (4)

This course addresses the intersection of data science and contemporary arts and culture, exploring four main themes of authorship, representation, visualization, and data provenance. The course is not solely an introduction to data science techniques nor merely an arts practice course, but explores significant new possibilities for both fields arising from their intersection. Students will examine problems from complementary perspectives of artist-researchers and data scientists. **Prerequisites:** DSC 80.

DSC 161. Text as Data (4)

This class explores statistical and computational methods to enable students to use text as a data source in the social sciences. Hands-on examples will equip students to work with text data in final projects. **Prerequisites:** ECON 5 or POLI 170A or POLI 171 or POLI 30 or POLI 30D or POLI 5 or POLI 5D. Students will not receive credit for both POLI 176 and DSC 161. Restricted to junior and senior level students. Restricted to students with upper-division standing.

DSC 167. Fairness and Algorithmic Decision-Making (4)

This course examines the greater context under which the practice of data science exists and explores concrete ways these issues surface in technical work. Students learn frameworks for understanding how individuals relate to social institutions, use them to identify how issues of fairness arise in the “life of a data scientist,” and apply them to propose and critique potential solutions. **Prerequisites:** DSC 80.

DSC 170. Spatial Data Science and Applications (4)

Spatial data science is a set of concepts and methods that deal with accessing, managing, visualizing, analyzing, and reasoning about spatial data in applications where location, shape and size of objects, and their mutual arrangement are important. This upper-division course explores advanced data science concepts for spatial data, introducing students to principles and techniques of spatial data analysis, including geographic information systems, spatial big data management, and geostatistics. **Prerequisites:** DSC 80. Restricted to students with upper-division standing.

DSC 180A. Data Science Project I (4)

In this two-course sequence students will investigate a topic and design a system to produce statistically informed output. The investigation will span the entire lifecycle, including assessing the problem, learning domain knowledge, collecting/cleaning data, creating a model, addressing ethical issues, designing the system, analyzing the output, and presenting the results. 180A deals with research, methodology, and system design. Students will produce a research summary and a project proposal. **Prerequisites:** DSC 102 and MATH 189 and CSE 151 or CSE 158. Restricted to students with upper-division standing. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 180B. Data Science Project II (4)

In this two-course sequence students will investigate a topic and design a system to produce statistically informed output. The investigation will span the entire lifecycle, including assessing the problem, learning domain knowledge, collecting/cleaning data, creating a model, addressing ethical issues, designing the system, analyzing the output, and presenting the results. 180B will consist of implementing the project while studying the best practices for evaluation. **Prerequisites:** DSC 106 and DSC 180A. Restricted to students with upper-division standing. Restricted to students within the DS25 major. All other students will be allowed as space permits.

DSC 190. Topics in Data Science (4)

Topics of special interest in data science. Topics vary from quarter to quarter. May be taken for credit up to four times when topic varies. **Prerequisites:** Department and instructor approval required to monitor enrollment and to ensure that students have sufficient educational background for a given topic. Restricted to students with upper-division standing.

DSC 191. Seminar in Data Science (1 or 2)

A seminar course on topics of current interest in data science. Topics may vary from quarter to quarter. May be taken for credit three times. **Prerequisites:** Restricted to students with upper-division standing. Department and instructor approval is required to monitor enrollment and to ensure that students have sufficient educational background for a given topic.

DSC 197. Data Science Internship (1–4)

Directed study and research at laboratories/institutions outside of campus. **Prerequisites:** Restricted to students with upper-division standing. Consent of the instructor and approval of the department. An application for Special Studies must be filed with the Registrar’s Office after approval from the instructor and the department chair.

DSC 198. Directed Group Study in Data Science (2 or 4)

Data science topics whose study involves reading and discussion by a small group of students under supervision of a faculty member. May be taken for credit up to two times. **Prerequisites:** Restricted to students with upper-division standing. Consent of the instructor and approval of the department. Department stamp required. An application for Special Studies must be filed with the Registrar’s Office after approval from the instructor and the department chair.

DSC 199. Independent Study for Data Science Undergraduates (2 or 4)

Independent reading or research on a topic related to data science by special arrangement with a faculty member. May be taken for credit up to two times. **Prerequisites:** Restricted to students with upper-division standing. Consent of the instructor and approval of the department. An application for Special Studies must be filed with the Registrar’s Office after approval from the instructor and the department chair.

## Graduate

DSC 290. Seminar in Data Science (1 or 2)

A graduate seminar course in which topics of special interest in data science will be presented by faculty or by graduate students under faculty direction. Restricted to graduate level students. May be repeated for credit twenty-four times as topics vary.

DSC 291. Topics in Data Science (4)

Topics of special interest in data science. Topics may vary quarter to quarter. Restricted to graduate level students. May be taken for credit up to nine times.

DSC 500. Teaching Assistantship (2 or 4)

A course in which teaching assistants are aided in learning proper teaching methods by means of supervision of their work by the faculty: handling of discussions, preparation and grading of examinations and other written exercises, and student relations. Number of units for credit depends on number of hours devoted to class or section assistance. May be taken for credit up to seventy-two units. **Prerequisites:** DSC 500 is for selected teaching assistants only; therefore, consent of instructor is required.

DSC 599. Teaching Methods in Data Science (2)

Training in teaching methods in the field of data science. This course examines theoretical and practical communication and teaching techniques particularly appropriate to data science. **Prerequisites:** Consent of faculty required. Only graduate students who are TAing for the first time in the data science program are eligible to enroll.