The Search Problem

The problem we are presented: Given a stream of data, retrieve information of interest.

Examples

  • Website users post to personal pages. Serve content only to friends.
  • Given logs for thousands of weather stations, display weather map for specified date and time.
  • Dog owners request the best pet store, choosing to define their best store either in price, quality, or atmosphere.

All of the data structures we’ve discussed so far have been to solve the search problem. They are used for storing information in schemes that make searching efficient in specific scenarios.

Search Data Structures

Abstract Data Types

Remember that these are Abstract Data Types. This means that we define the behavior, not the implementation.

Also, we’ve defined many possible implementations. Let’s see how these implementations and ADTs interact:

ADTs and Implementations

What does this diagram tell us?

  • Many implementations can be used in implementing different ADTs.
  • The red colored implementations (meaning poor performance) tell us that not all implementations are optimal for the behavior we are trying achieve.

Exercise:

Consider how each of these implementations can be modified in order to accommodate to the behavior attempting to be defined. For example, how would we use a Hash Table to function as a Priority Queue?

Abstraction

Abstraction often happens in layers! Abstract Data Types can often contain two abstract ideas boiling down to one implementation. Let’s consider some examples:

  • Priority Queue ADT: We were attempting to find an implementation that would be efficient for PQ operations. We decided that our Priority Queue would be implemented using a Heap Ordered Tree, but as we saw we had several approaches (1A, 1B, 1C, 2, 3) of representing a tree for heaps.
  • External Chaining Hash Table: This data structure is implemented using an array of buckets, but these buckets can be done using either an ArrayList, Resizing Array, Linked List, or BST.

These two examples tell us that we can often think of an ADT by the use of another ADT, and that ADTs have layers of abstraction, and each defining a behavior that is more specific than the idea that came before it.

Summary

When trying to manage complexity and write programs that are fast, it is important of knowing first of all:

  • What the Abstract Data Types are that will make your life easy?
  • Which implementation will make your implementation faster?
    • Usually a good implementation is good enough.

From Wikipedia:

A data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data.

数据类型主要关心存储怎样的数据,怎样访问它;数据结构主要关心数据在内存中的组织形式。

  • List, Stack, Queue 等是 Abstract Data Type(或者说是 Data Type?算法第四版交换使用 ADT 和 Data Type 这两个词,似乎不作区分)
  • 而 Array 和 Linked List 是 Data Structure,侧重的是数据的存储方式。
  • 每一种 ADT 都既可以用 Array 来实现,也可以用 Linked List 来实现。

In computer science, a tree is a widely used abstract data type (ADT)—or data structure (like BSTs) implementing this ADT.

Also, we can use Array or Linked List to store a tree.