Data Structure

Data structure refers to the logical arrangement of data, often in a tabular format. Tables are composed of columns and rows. Columns represent fields (sometimes called items), which describe some domain of interest. For example,

Restaurant    Address              Cuisine
-----------   ------------------   -------

Rows represent records. Records contain the attributional data in a given domain. For example,

Restaurant    Address              Cuisine
-----------   ------------------   -------
Romano’s      334 Washington St.   Italian
Gyro Palace   1449 Dalia Ave.      Greek

How we organize our data affects how we can use it. The above table can support the query “What kind of cuisine does Romano’s serve?” However, it cannot support the query “What restaurant is on Washington St.?” Why?
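One possible answer can be sketched in a few lines of Python. This is only an illustration, assuming a query means an exact match on a single field: the restaurant name has a field of its own, but the street name is buried inside the Address field. The select helper and the street_number and street_name fields are hypothetical and are not part of the original table.

```python
# The table from the text, one dict per record (row).
restaurants = [
    {"restaurant": "Romano's", "address": "334 Washington St.", "cuisine": "Italian"},
    {"restaurant": "Gyro Palace", "address": "1449 Dalia Ave.", "cuisine": "Greek"},
]

def select(table, field, value):
    """Return the records whose field exactly equals value."""
    return [row for row in table if row[field] == value]

# Cuisine for Romano's: the name is its own field, so an exact match works.
romanos_cuisine = [row["cuisine"] for row in select(restaurants, "restaurant", "Romano's")]

# Restaurants on Washington St.: no address field holds exactly
# "Washington St.", so the exact match finds nothing.
on_washington = select(restaurants, "address", "Washington St.")

# Hypothetical restructuring: split the address into two fields.
restructured = []
for row in restaurants:
    number, street = row["address"].split(" ", 1)
    restructured.append({
        "restaurant": row["restaurant"],
        "street_number": number,
        "street_name": street,
        "cuisine": row["cuisine"],
    })

on_washington_fixed = [
    row["restaurant"] for row in select(restructured, "street_name", "Washington St.")
]
```

With the address split into two fields, the street query becomes as simple as the cuisine query.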

One way to organize data is to create a database. Databases provide a structure in which a collection of inter-related data can be stored, managed, and retrieved. Database Management Systems (DBMS’s) provide:

Advantages of DBMS’s:

One form of DBMS is a relational database management system (RDBMS). RDBMS’s are structured in a tabular format. Any field in a table may serve as a key for accessing data in another table, which permits all objects and attributes to be related to each other. RDBMS’s are frequently used in GIS because of their simple and flexible structure. They also support the complex relationships common among real-world geographic objects.

RDBMS’s

Relational database management systems are made up of two-dimensional tables that can be manipulated by operations. Unlike a spreadsheet, RDBMS tables can be linked together so that complex information can be stored and retrieved more efficiently.

One way of creating relationships is through the Entity-Relationship Model.

The Entity-Relationship Model (ER Model) is a data model in which information stored in the database is viewed as sets of entities and sets of relationships among entities.

Entity: something that exists and can be distinguished from other entities.

Examples:

customer entities with unique social security numbers

account entities with unique account numbers

Entity set: a set of entities of the same type.

Example: all of the account entities for a bank

Attribute: a characteristic of an entity

Example: a customer entity might have attributes such as: customer name, social security number, address, ...

Representation of Entities and Entity Sets

An entity consists of a value for each of its attributes.

Entities are written as a record (like a row in a table).

Example: an entity with attributes name, social security number, and address

Customer
Elvis Presley   121212121   Graceland

Note that this representation forces attributes to be written in the same order.
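As a rough illustration of why the order matters, the record above can be written as a plain Python tuple, where position is the only thing that says which value is which. A named record keeps the same fixed order but attaches a name to each position. The Customer class and its field names (name, ssn, address) are assumptions made for this sketch.

```python
from typing import NamedTuple

# A bare tuple: the meaning of each value depends entirely on its position.
elvis_tuple = ("Elvis Presley", "121212121", "Graceland")

class Customer(NamedTuple):
    """One customer entity; field order is fixed by the class definition."""
    name: str
    ssn: str
    address: str

elvis = Customer(name="Elvis Presley", ssn="121212121", address="Graceland")

# Keyword arguments may be given in any order; the stored order does not change.
same_elvis = Customer(address="Graceland", ssn="121212121", name="Elvis Presley")
```

Here elvis.ssn and elvis_tuple[1] refer to the same value; the named form simply makes the fixed order explicit.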

Relationships and Relationship Sets

Relationship: an association among 2 or more entities

Example: the relationship between a customer and the model of car purchased at the dealership

Customer
Elvis Presley   121212121   Graceland

and

Purchase history
Elvis Presley   4/25/50   ½ ton pickup
Elvis Presley   8/2/68    Cadillac convertible

Relationship set: a set of relationships of the same type

Example: the customers and types of cars they purchased

Cardinalities

A mapping cardinality is a data constraint that specifies how many entities an entity can be related to in a relationship set.

Example: each customer can purchase as many automobile models as they want, but can have only one favorite.

A binary relationship set is a relationship set on two entity sets. Mapping cardinalities on binary relationship sets are simplest.
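A mapping cardinality can be checked mechanically. The sketch below treats a binary relationship set as a list of (customer, car model) pairs and counts how many models each customer is related to. The function names and the favorites data are hypothetical and serve only to illustrate the "as many as they want" versus "only one" constraints from the example.

```python
from collections import Counter

# Binary relationship sets as lists of (customer, car model) pairs.
purchases = [
    ("Elvis Presley", "1/2 ton pickup"),
    ("Elvis Presley", "Cadillac convertible"),
]
# Hypothetical data: the text does not say which model is the favorite.
favorites = [("Elvis Presley", "Cadillac convertible")]

def max_related(relationship_set):
    """Largest number of right-hand entities related to any one left-hand entity."""
    counts = Counter(left for left, _ in relationship_set)
    return max(counts.values(), default=0)

def satisfies_at_most_one(relationship_set):
    """True if every left-hand entity is related to at most one other entity."""
    return max_related(relationship_set) <= 1
```

The purchases set would fail an "at most one" constraint, which is fine because purchases are unrestricted; the favorites set passes it.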

Keys

Relationships are created by means of a key. Keys uniquely identify entities that comprise the relationship.

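Example: the name field appears in both entity sets above. It uniquely identifies each Customer record, and each Purchase history record uses it to refer back to that customer.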
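To show a key at work, here is a minimal sketch using Python's built-in sqlite3 module with the Customer and Purchase history records from above. The table and column names (customer, purchase_history, purchase_date, model) are assumptions chosen for this example; the shared name field is what links the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE customer (
        name    TEXT PRIMARY KEY,   -- key: one row per customer
        ssn     TEXT,
        address TEXT
    );
    CREATE TABLE purchase_history (
        name          TEXT REFERENCES customer(name),  -- points back to customer
        purchase_date TEXT,
        model         TEXT
    );
""")
conn.execute("INSERT INTO customer VALUES (?, ?, ?)",
             ("Elvis Presley", "121212121", "Graceland"))
conn.executemany("INSERT INTO purchase_history VALUES (?, ?, ?)", [
    ("Elvis Presley", "4/25/50", "1/2 ton pickup"),
    ("Elvis Presley", "8/2/68", "Cadillac convertible"),
])

# Follow the shared name field from each purchase to the customer's address.
rows = conn.execute("""
    SELECT c.address, p.model
    FROM purchase_history AS p
    JOIN customer AS c ON c.name = p.name
    ORDER BY p.rowid
""").fetchall()
conn.close()
```

Each row of the result pairs a purchased model with the buyer's address, information that neither table holds on its own.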

Analytical Operations

Queries

Data query is the process of asking questions of your data. In this context, you can ask:

There are two types of data query: attributional and spatial.

Attributional

Attributional queries are based on feature attributes. These queries include selecting features by some attribute so they may be redisplayed or reclassified. Queries can be descriptive in nature, or can use logical or Boolean operators.

For example:

What type of food does Romano’s serve?

Show all areas that are greater than 5 acres.

Select zoning districts that allow residential or commercial uses.
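Queries like the last two can be sketched as filters over a list of records. The parcel records, attribute names, and values below are made up for illustration, and the operators shown (>, ==, and, or, not) are Python's own, assumed here to be typical of the comparison and Boolean operators a GIS offers.

```python
# Hypothetical parcel records; the attribute names and values are made up.
parcels = [
    {"id": 1, "acres": 2.5, "zoning": "residential"},
    {"id": 2, "acres": 7.0, "zoning": "commercial"},
    {"id": 3, "acres": 12.0, "zoning": "industrial"},
    {"id": 4, "acres": 5.0, "zoning": "residential"},
]

# Comparison (>): areas greater than 5 acres.
large = [p["id"] for p in parcels if p["acres"] > 5]

# Boolean OR: residential or commercial zoning.
res_or_com = [p["id"] for p in parcels
              if p["zoning"] == "residential" or p["zoning"] == "commercial"]

# Boolean AND and NOT combined with comparisons.
large_and_not_industrial = [p["id"] for p in parcels
                            if p["acres"] > 5 and not p["zoning"] == "industrial"]
```

Note that parcel 4, at exactly 5 acres, is not selected by the "greater than 5" query.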

Logical operators are:

Boolean operators are:

Spatial

We can also make use of spatial queries. Spatial queries answer questions of adjacency, containment, buffering (sphere of influence), union, and intersection.
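Three of these questions (containment, buffering, and intersection) can be sketched with nothing more than points and axis-aligned rectangles. This is a deliberately simplified model, not how a GIS represents geometry; the feature names and coordinates are hypothetical.

```python
import math

# Hypothetical features: points are (x, y); rectangles are (xmin, ymin, xmax, ymax).
well = (3.0, 4.0)
school = (6.0, 8.0)
parcel = (0.0, 0.0, 5.0, 5.0)
floodplain = (4.0, 2.0, 9.0, 7.0)

def contains(rect, point):
    """Containment: is the point inside (or on the edge of) the rectangle?"""
    xmin, ymin, xmax, ymax = rect
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax

def within_buffer(center, point, radius):
    """Buffering: is the point within `radius` of the center point?"""
    return math.dist(center, point) <= radius

def intersection(a, b):
    """Intersection: the overlapping rectangle, or None if there is no overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)
```

For instance, contains(parcel, well) asks whether the well lies on the parcel, and intersection(parcel, floodplain) returns the part of the parcel that is in the floodplain.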

Overlay analysis

The last two spatial queries, union and intersection, are often referred to as overlay analysis.

Map algebra

The remaining form of spatial analysis makes use of arithmetic operations (+, -, *, and /). This is called map algebra. Map algebra is essentially doing math with maps. The maps must be in a raster format, and the values in corresponding cells are added, subtracted, multiplied, or divided.
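A minimal sketch of the idea, assuming each raster is a list of rows and both rasters have the same dimensions. The map_algebra function and the sample rasters are hypothetical.

```python
import operator

def map_algebra(raster_a, raster_b, op):
    """Apply op to each pair of corresponding cells of two same-sized rasters."""
    if len(raster_a) != len(raster_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(raster_a, raster_b)
    ):
        raise ValueError("rasters must have the same dimensions")
    return [
        [op(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(raster_a, raster_b)
    ]

# Hypothetical 2 x 3 rasters.
elevation = [[10, 20, 30],
             [40, 50, 60]]
offset = [[1, 2, 3],
          [4, 5, 6]]

total = map_algebra(elevation, offset, operator.add)
difference = map_algebra(elevation, offset, operator.sub)
product = map_algebra(elevation, offset, operator.mul)
quotient = map_algebra(elevation, offset, operator.truediv)
```

Each output raster has the same shape as the inputs, and every output cell depends only on the two input cells at the same position.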

You can use map algebra to set masks, that is, to screen out data values that are not of interest. This is often done by using binary maps (values of 0 or 1) and by making use of No Data values. You can also create unique cell values that can be traced back to identify what combination of input values created the new value.
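Both ideas can be sketched on small rasters stored as lists of rows. The sketch assumes None stands in for No Data and that every class code is a single digit (0 to 9), so that a * 10 + b yields a value unique to each combination. The rasters and function names are hypothetical.

```python
NO_DATA = None  # stand-in for a raster's No Data value (an assumption of this sketch)

def apply_mask(raster, mask):
    """Keep cells where mask is 1; set cells where mask is 0 to NO_DATA."""
    return [
        [value if keep == 1 else NO_DATA for value, keep in zip(row, mask_row)]
        for row, mask_row in zip(raster, mask)
    ]

def combine(raster_a, raster_b):
    """Encode each cell pair as a * 10 + b (assumes every value is in 0..9)."""
    return [
        [a * 10 + b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(raster_a, raster_b)
    ]

land_use = [[1, 2], [3, 1]]      # hypothetical class codes
soil = [[4, 4], [5, 6]]          # hypothetical class codes
study_area = [[1, 0], [1, 1]]    # binary mask: 1 = of interest

masked = apply_mask(land_use, study_area)
combined = combine(land_use, soil)

# Trace one combined value back to the inputs that produced it.
land_use_code, soil_code = divmod(combined[1][0], 10)
```

The divmod step recovers the original pair of codes from the combined value, which is what makes the new values traceable.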

Summarizing data

When we summarize data, we are essentially asking how many times X occurs. Along with that count, we can apply several operations to numerical data. For example, we can find out how many lakes are larger than 10 acres and report the sum of their areas, the minimum or maximum lake size, or the average lake size.

We can also do some simple statistical calculations on data. The basic measures of central tendency (mean, median, and mode) are available, and we can also calculate standard deviation and variance.
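The lake example and the statistics above can be sketched with Python's standard statistics module. The lake areas are invented for illustration, and the variance and standard deviation shown are the sample versions; a particular GIS package may use the population versions instead.

```python
import statistics

# Hypothetical lake areas in acres.
lake_acres = [4.0, 12.0, 25.0, 12.0, 8.0, 40.0]

# How many lakes are larger than 10 acres, with summary figures for them.
big = [a for a in lake_acres if a > 10]
summary = {
    "count": len(big),
    "sum": sum(big),
    "min": min(big),
    "max": max(big),
    "mean": statistics.mean(big),
}

# Central tendency and spread over all lakes.
mean_all = statistics.mean(lake_acres)
median_all = statistics.median(lake_acres)
mode_all = statistics.mode(lake_acres)
variance_all = statistics.variance(lake_acres)   # sample variance
stdev_all = statistics.stdev(lake_acres)         # sample standard deviation
```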

Assignment(s):