CPSC 2430: Data Structures: 5 credit class

About

Fundamental data structures including binary search trees, priority queues, hash tables, and heaps. Abstract data type implementation and design. Code analysis using big-O notation, recursion, and sorting algorithms.

Prerequisites

Take CPSC-1230 or CPSC-1430; Minimum grade C; - Must be completed prior to taking this course.

Learning Outcomes

• Analyze the runtime performance of code segments and data structure operations using big-O notation.
• Implement recursive solutions to problems, properly identifying base case(s) and recursive relations.
• Implement recursive and non-recursive algorithms to manipulate binary search trees, heaps, and priority queues; use them appropriately in solving problems.
• Implement algorithms to manipulate hash tables and use them appropriately in solving problems.
• Compare and contrast sorting algorithms, including quicksort, mergesort, and heapsort.

Big-O Notation

Big O notation is a mathematical notation used to describe the upper bound of an algorithm's running time or space complexity in terms of the input size.
It provides an asymptotic analysis, focusing on the behavior of the algorithm as the input size grows towards infinity.

This notation helps in comparing the efficiency of different algorithms by providing a high-level understanding of their performance characteristics.

O(1): Constant time. The running time or space required does not change with the size of the input. Example: Accessing an element in an array by index.

O(log n): Logarithmic time. The running time grows logarithmically with the input size. Example: Binary search in a sorted array.

O(n): Linear time. The running time grows linearly with the input size. Example: Iterating through an array.

O(n log n): Linearithmic time. The running time grows linearly with a logarithmic factor. Example: Efficient sorting algorithms like mergesort and heapsort.

O(n^2): Quadratic time. The running time grows quadratically with the input size. Example: Simple sorting algorithms like bubble sort, selection sort, and insertion sort.

O(2^n): Exponential time. The running time grows exponentially with the input size. Example: Solving the Traveling Salesman Problem using a brute-force approach.

O(n!): Factorial time. The running time grows factorially with the input size. Example: Generating all permutations of a set.
Predict Performance: It helps in predicting the performance of an algorithm as the input size grows.

Compare Algorithms: It provides a standard way to compare the efficiency of different algorithms.

Identify Bottlenecks: It helps in identifying parts of an algorithm that could be optimized for better performance.

Scalability: It gives an understanding of how well an algorithm scales with larger datasets.

Recursion

Recursion in computer science, including data structures, refers to a process where a function calls itself in order to solve a smaller instance of the same problem.

In the context of data structures, recursion often involves algorithms that traverse or manipulate recursive data structures, such as trees and graphs.

Characteristics of Recursive Functions
Base Case: A condition that stops the recursion by providing an exit point. Without a base case, the recursion would continue indefinitely, leading to a stack overflow.
Recursive Case: The part of the function that makes the recursive call to solve a smaller instance of the problem. This typically involves breaking down the problem into smaller subproblems.
Call Stack: Recursion uses the call stack to keep track of function calls. Each recursive call adds a new frame to the call stack, and when the base case is reached, the stack unwinds, and the function returns control back to the caller.
A common example of recursion in data structures is tree traversal, where a function visits each node of a tree in a specific order. There are three primary methods of tree traversal: in-order, pre-order, and post-order.

Binary Search Trees

A binary search tree (BST) is a binary tree data structure where each node has at most two children (left and right), and the key (value) of each node is greater than all keys in its left subtree and less than all keys in its right subtree.

This ordering property allows for efficient searching, insertion, and deletion operations, making binary search trees a fundamental data structure in computer science.

Characteristics of Binary Search Trees
Binary Tree Structure: Each node in a BST has at most two children: a left child and a right child.
Ordering Property: The key of each node is greater than all keys in its left subtree and less than all keys in its right subtree.
This property ensures that the tree remains ordered, facilitating efficient searching operations.

Operations on Binary Search Trees
Search: Find a given key in the binary search tree.
Insertion: Add a new key to the binary search tree while maintaining the binary search tree properties.
Deletion: Remove a key from the binary search tree while preserving the binary search tree properties.
Traversal: Visit all nodes of the binary search tree in a specific order. Common traversal methods include in-order, pre-order, and post-order traversals.
In this BST, if we use a 1-based indexing scheme and store the nodes in an array, it would look like this:
[10, 5, 20, 3, 8, 15, 25]
Here, the root node (10) is stored at index 0, its left child (5) at index 1, its right child (20) at index 2, and so on.

Priority Queues

A priority queue is an abstract data type that operates similar to a regular queue or stack, but with an additional priority associated with each element.

In a priority queue, elements are stored based on their priority, and elements with higher priority are dequeued before elements with lower priority.
Shortest Path Algorithms: Used in Dijkstra's algorithm to find the shortest path in a graph.

Scheduling Algorithms: Used in job scheduling to process tasks based on priority.

Huffman Coding: Used in data compression algorithms to encode characters based on their frequency of occurrence.

Event-driven Simulation: Used in discrete event simulation to schedule and process events based on their occurrence time.

Hash Tables

A hash table is a data structure that stores key-value pairs, allowing for efficient insertion, deletion, and retrieval of elements based on their keys.

Hash tables use a hash function to compute an index (also known as a hash code) into an array of buckets or slots, where each bucket may contain one or more key-value pairs.

Helpful Resources