Data Structure
Data structure refers to the logical arrangement of data, often in a tabular format. Tables are composed of columns and rows. Columns represent fields (sometimes called items). Fields describe some domain of interest. For example,
| Restaurant | Address | Cuisine |
|            |         |         |
|            |         |         |
Rows represent records. Records contain the attributional data in a given domain. For example,
| Restaurant | Address | Cuisine |
| Romano’s | 334 Washington St. | Italian |
| Gyro Palace | 1449 Dalia Ave. | Greek |
How we organize our data affects how we can use that data. The above table can support the query "What kind of cuisine does Romano’s serve?" However, it cannot support the query "What restaurant is on Washington St.?" Why?
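One plausible reading of the question can be sketched in Python: the cuisine query matches a whole field value, whereas the street name is stored together with the house number in the Address field. The list-of-dictionaries layout and the helper name `lookup` are assumptions made for illustration only.

```python
# The restaurant table from above, one dict per record (row).
restaurants = [
    {"Restaurant": "Romano's", "Address": "334 Washington St.", "Cuisine": "Italian"},
    {"Restaurant": "Gyro Palace", "Address": "1449 Dalia Ave.", "Cuisine": "Greek"},
]

def lookup(records, field, value):
    """Return the records whose field exactly equals value (hypothetical helper)."""
    return [r for r in records if r[field] == value]

# Supported: the restaurant name is stored in a field of its own.
cuisine = [r["Cuisine"] for r in lookup(restaurants, "Restaurant", "Romano's")]

# Not supported: no field holds just the street name, so an exact match finds nothing.
on_washington = lookup(restaurants, "Address", "Washington St.")

print(cuisine)        # ['Italian']
print(on_washington)  # []
```

Splitting Address into separate number and street fields would let the same exact-match lookup answer the second question.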
One way to organize data is to create a database. Databases provide a structure in which a collection of inter-related data can be stored, managed, and retrieved. Database Management Systems (DBMS’s) provide:
Advantages of DBMS’s:
One form of DBMS is a relational database management system (RDBMS). RDBMS’s are structured in a tabular format. Any field in a table may be a key for accessing data in another table. This permits all objects and attributes to be related to each other. RDBMS’s are frequently used in GIS because of their simple and flexible structure. They also support the complex relationships common among real-world geographic objects.
RDBMS’s
Relational database management systems are made up of two-dimensional tables that can be manipulated by operations. Unlike a spreadsheet, RDBMS tables can be linked together so that complex information can be stored and retrieved more efficiently.
One way of creating relationships is through the Entity-Relationship Model.
The Entity-Relationship Model (ER Model) is a data model in which information stored in the database is viewed as sets of entities and sets of relationships among entities.
Entity: something that exists and can be distinguished from other entities.
Examples:
customer entities with unique social security numbers
account entities with unique account numbers
Entity set: a set of entities of the same type.
Example: all of the account entities for a bank
Attribute: a characteristic of an entity
Example: a customer entity might have attributes such as: customer name, social security number, address, ...
Representation of Entities and Entity Sets
An entity consists of a value for each of its attributes.
Entities are written as a record (like a row in a table).
Example: an entity with attributes name, social security number, and address
Customer
| Elvis Presley | 121212121 | Graceland |
Note that this representation forces attributes to be written in the same order.
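As a rough illustration of an entity written as a record, the sketch below uses Python's `typing.NamedTuple`. The class name `Customer` and the field names `name`, `ssn`, and `address` are assumptions chosen to mirror the example above.

```python
from typing import NamedTuple

class Customer(NamedTuple):
    # Field order is fixed by the declaration, like the columns of a table.
    name: str
    ssn: str
    address: str

# An entity supplies one value for each attribute, in the declared order.
elvis = Customer("Elvis Presley", "121212121", "Graceland")

# Positional access depends on the fixed attribute order...
print(elvis[1])       # 121212121
# ...while named access does not.
print(elvis.address)  # Graceland
```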
Relationships and Relationship Sets
Relationship: an association among two or more entities
Example: the relationship between a customer and the model of car purchased at the dealership
Customer
| Elvis Presley | 121212121 | Graceland |
and
Purchase history
| Elvis Presley | 4/25/50 | ½ ton pickup |
| Elvis Presley | 8/2/68 | Cadillac convertible |
Relationship set: a set of relationships of the same type
Example: the customers and types of cars they purchased
Cardinalities
A mapping cardinality is a data constraint that specifies how many entities an entity can be related to in a relationship set.
Example: each customer can purchase as many automobile models as they want, but can have only one favorite.
A binary relationship set is a relationship set on two entity sets. Mapping cardinalities on binary relationship sets are simplest.
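The cardinalities in the example above might be sketched with plain Python dictionaries: a list of values models "as many as they want", and a single value per key models "only one favorite". The helper `set_favorite` and the choice of favorite are made up for illustration.

```python
# One-to-many: a customer may be related to any number of purchased models.
purchases = {"Elvis Presley": ["1/2 ton pickup", "Cadillac convertible"]}

# At most one favorite per customer: a dict keeps a single value for each key.
favorite = {}

def set_favorite(customer, model):
    """Record a favorite; a later call replaces the earlier one (hypothetical helper)."""
    favorite[customer] = model

set_favorite("Elvis Presley", "1/2 ton pickup")
set_favorite("Elvis Presley", "Cadillac convertible")  # illustrative value only

print(len(purchases["Elvis Presley"]))  # 2
print(favorite["Elvis Presley"])        # Cadillac convertible
```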
Keys
Relationships are created by means of a key. Keys uniquely identify the entities that make up the relationship.
Example: the name field identifies the customer in both tables above, so it can serve as the key that links each purchase record to its customer record.
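A minimal sketch of linking the two tables above through the name field, using Python's built-in `sqlite3` module. The table and column names (`customer`, `purchase`, `ssn`, `model`, and so on) are assumptions, not part of the original example.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # throwaway in-memory database

# name is the key of customer; purchase refers back to it.
con.execute("CREATE TABLE customer (name TEXT PRIMARY KEY, ssn TEXT, address TEXT)")
con.execute("CREATE TABLE purchase (name TEXT REFERENCES customer(name), date TEXT, model TEXT)")

con.execute("INSERT INTO customer VALUES ('Elvis Presley', '121212121', 'Graceland')")
con.executemany(
    "INSERT INTO purchase VALUES (?, ?, ?)",
    [("Elvis Presley", "4/25/50", "1/2 ton pickup"),
     ("Elvis Presley", "8/2/68", "Cadillac convertible")],
)

# The shared key lets one query pull attributes from both tables.
rows = con.execute(
    "SELECT c.address, p.model FROM customer AS c "
    "JOIN purchase AS p ON p.name = c.name ORDER BY p.rowid"
).fetchall()
print(rows)  # [('Graceland', '1/2 ton pickup'), ('Graceland', 'Cadillac convertible')]
con.close()
```

In practice a person's name is a fragile key, since two customers can share one; a number such as the account number mentioned earlier is usually a safer choice.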
Analytical Operations
Queries
Data query is the process of asking questions of your data. In this context, you can ask:
There are two types of data query: attributional and spatial.
Attributional
Attributional queries are based on feature attributes. These queries include selecting features by some attribute so they may be redisplayed or reclassified. Queries can be descriptive in nature, or can use logical or Boolean operators.
For example:
What type of food does Romano’s serve?
Show all areas that are greater than 5 acres.
Select zoning districts that allow residential or commercial uses.
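The last two example queries might look like this in Python. The parcel records and field names are made up, and "logical operator" is taken here to mean a comparison such as `>`, which is an assumption.

```python
# Made-up parcel records; the field names and values are assumptions.
parcels = [
    {"id": 1, "acres": 2.5, "zoning": "residential"},
    {"id": 2, "acres": 7.0, "zoning": "commercial"},
    {"id": 3, "acres": 12.0, "zoning": "industrial"},
]

# Comparison: areas greater than 5 acres.
large = [p["id"] for p in parcels if p["acres"] > 5]

# Boolean OR: zoning that is residential or commercial.
res_or_com = [p["id"] for p in parcels
              if p["zoning"] == "residential" or p["zoning"] == "commercial"]

print(large)       # [2, 3]
print(res_or_com)  # [1, 2]
```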
Logical operators are:
Boolean operators are:
Spatial
We can also make use of spatial queries. Spatial queries answer questions of adjacency, containment, buffering (sphere of influence), union, and intersection.
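A heavily simplified sketch of three of these queries in plain Python, using points, circles, and axis-aligned rectangles in place of real map features. A GIS would work with arbitrary polygons; the data, function names, and map units here are assumptions.

```python
import math

# Made-up point features in arbitrary map units.
wells = {"A": (1.0, 1.0), "B": (4.0, 5.0), "C": (9.0, 9.0)}

def within_buffer(point, center, radius):
    """Buffering: is point inside the circle of the given radius around center?"""
    return math.dist(point, center) <= radius

def in_rectangle(point, rect):
    """Containment: is point inside the rectangle (xmin, ymin, xmax, ymax)?"""
    x, y = point
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax

def intersect(a, b):
    """Intersection of two rectangles, or None if they do not overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    return (xmin, ymin, xmax, ymax) if xmin < xmax and ymin < ymax else None

near = sorted(k for k, p in wells.items() if within_buffer(p, (0.0, 0.0), 7.0))
inside = sorted(k for k, p in wells.items() if in_rectangle(p, (3, 3, 10, 10)))
overlap = intersect((0, 0, 5, 5), (3, 3, 10, 10))

print(near)     # ['A', 'B']
print(inside)   # ['B', 'C']
print(overlap)  # (3, 3, 5, 5)
```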
Overlay analysis
The next two spatial queries are often referred to as overlay analysis.
Map algebra
The remaining form of spatial analysis makes use of arithmetic operations (+, -, *, and /). This is called map algebra. Map algebra is essentially doing math with maps. The maps must be in a raster format, and the values in corresponding cells of the input maps are added, subtracted, multiplied, or divided.
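Cell-by-cell arithmetic can be sketched with nested Python lists standing in for rasters. The function name `map_algebra` and the sample grids are assumptions; real GIS packages provide their own raster calculators.

```python
import operator

def map_algebra(a, b, op):
    """Apply op to each pair of corresponding cells of two equal-sized rasters."""
    return [[op(x, y) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

# Two made-up 2 x 2 rasters.
elevation = [[10, 20], [30, 40]]
offset = [[1, 2], [3, 4]]

total = map_algebra(elevation, offset, operator.add)
product = map_algebra(elevation, offset, operator.mul)

print(total)    # [[11, 22], [33, 44]]
print(product)  # [[10, 40], [90, 160]]
```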
You can use map algebra to set masks, that is, to screen out data values that are not of interest. This is often done by using binary maps (values of 0 or 1) and by making use of No Data values. You can also create unique cell values that can be traced back to identify which combination of input values produced each new value.
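Both ideas can be sketched with small nested lists. Using `None` as the No Data value and encoding each pair of inputs as `a * 10 + b` are assumptions made for illustration; the encoding only works while the second raster's values stay between 0 and 9.

```python
NODATA = None  # assumed stand-in for a raster's No Data value

def apply_mask(raster, mask):
    """Keep cells where the binary mask is 1; set the rest to NODATA."""
    return [[v if m == 1 else NODATA for v, m in zip(rv, rm)]
            for rv, rm in zip(raster, mask)]

def combine(a, b):
    """Encode each cell pair as a*10 + b (assumes values of b are 0-9)."""
    return [[x * 10 + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Made-up category rasters and a binary mask.
landuse = [[1, 2], [3, 1]]
soil = [[4, 4], [5, 6]]
mask = [[1, 0], [1, 1]]

masked = apply_mask(landuse, mask)
codes = combine(landuse, soil)

print(masked)  # [[1, None], [3, 1]]
print(codes)   # [[14, 24], [35, 16]]
# A code splits back into its inputs: 35 -> land use 3, soil 5.
print(divmod(codes[1][0], 10))  # (3, 5)
```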
Summarizing data
When we summarize data, we are essentially asking how many times X occurs. We can also apply several operations to numerical data at the same time. For example, we can find out how many lakes are larger than 10 acres and report the sum of their areas, the minimum or maximum lake size, or the average lake size.
We can also do some simple statistical calculations on data. The basic measures of central tendency (mean, median, and mode) are available, and we can also calculate standard deviation and variance.
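The lake example and the basic statistics can be sketched with Python's built-in `statistics` module. The lake areas are made up, and the sample (rather than population) forms of variance and standard deviation are used here as an assumption.

```python
import statistics

lake_acres = [4.0, 12.0, 12.0, 20.0, 8.0]  # made-up lake areas in acres

# Summary of the lakes larger than 10 acres.
big = [a for a in lake_acres if a > 10]
count = len(big)
total_area = sum(big)
smallest, largest = min(big), max(big)
average = total_area / count

print(count, total_area, smallest, largest)  # 3 44.0 12.0 20.0

# Measures of central tendency and spread for all lakes.
mean = statistics.mean(lake_acres)
median = statistics.median(lake_acres)
mode = statistics.mode(lake_acres)
variance = statistics.variance(lake_acres)  # sample variance
std_dev = statistics.stdev(lake_acres)      # sample standard deviation

print(median, mode)  # 12.0 12.0
```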
Assignment(s):