1 of 47

Intro to GIS

Lesson 7: Attribute Data

1

2 of 47

Objectives

  • Explain how tabular data are stored and queried
  • Identify cardinality of tables and create joins between tables
  • Explore data in tables using statistics and summarized statistics
  • Define appropriate field types and properties for attributes
  • Edit and calculate fields in tables
  • Use spreadsheet data in ArcGIS Pro

2

3 of 47

3

4 of 47

Types of tables

G I S tables come in two main types

  • Attribute tables store data associated with a spatial feature class.
  • Standalone tables simply store tabular data from any source.
    • Excel, text files, C S V files, and database files are common sources of standalone tables.

4

5 of 47

Tables

Source: Esri

  • A table is a data structure to store attributes
  • This table stores attributes for a counties feature class
  • Rows are called records
  • Columns are called fields
  • The Feature I D or Object I D is a unique integer managed by the software to identify the record

5

6 of 47

Database management systems

6

7 of 47

Flat file D B M S

A flat file database

  • Stores data as rows of information in files.
  • Is simple and robust.
  • Is on efficient for search and query.

Customers

Service calls

Electric usage

Service personal

7

8 of 47

Hierarchical D B M S

Stores data in multiple tables

The tables have defined parent-child relationships

  • Customer table is linked to a table of service calls.

Defines pre-set hierarchy of table relationships

  • Service personnel are linked to service calls but not to customers.

Very efficient for specific queries, but the allowable queries are limited by the structure

8

9 of 47

Relational D B M S

Stores data in multiple tables

Table relationships are defined as needed

  • Service personnel can be linked to customers if needed.

Very flexible

Ideal for open-ended applications when types of queries not known beforehand

Most common type used in G I S applications

9

10 of 47

Joining tables

Source: Esri

  • Table joins allow two tables to be used as a single table
  • Records are linked using a common field, or key, like STATE_F I P S
  • The target table receives the additional information
  • The join table provides the additional information

10

11 of 47

Multiple joins

  • More than two tables may be joined to a target table
  • Two tables with only a state name or state abbreviation field can be joined using a common table with both fields
  • The same table must be the target for all multiple joins (a table cannot be both a target table and a join table)

11

12 of 47

One-to-one joins

Source: Esri

  • Joining must take into account the cardinality between the two tables, or how many join records match each target record
  • In a one-to-one join, one record in the join table matches one record in the target table

12

13 of 47

Many-to-one joins

Source: Esri

  • In a many-to-one cardinality, one record in the join table matches many records in the target table
  • The join record is repeated as many times as needed in order to fill out the target records

13

14 of 47

One-to-many joins

Source: Esri

  • In a one-to-many cardinality, many records in the join table match one record in the target table
  • It cannot be displayed as a single table, because there are multiple records matched to each target record
  • A relate is used instead, in which the tables remain separate but are linked together, so that a selection in one table can be used to select the linked records in the other table

14

15 of 47

Many-to-many cardinality

In a many-to-many cardinality, multiple records in one table match multiple records in another table

  • Each student takes many classes, and each class has many students.

These complex relationships are rarely satisfactorily treated in G I S, although a relate is one way to handle them

15

16 of 47

Rule of Joining

Source: Esri

  • Each record in the target table must match one and only one record in the join table
  • This rule is satisfied for one-to-one and many-to-one cardinality, but not for a one-to-many cardinality

16

17 of 47

Unmatched records in joins

Source: Esri

Sometimes a target record has no match in the join table

  • The earthquakes table has records for Puerto Rico, but the states table does not.

Records without a match will show <Null> values in the fields from the join table

If all of the join fields have <Null> values, the join may have been performed incorrectly, resulting in no matches

17

18 of 47

Statistics on tables

18

19 of 47

Thinking critically with statistics

Source: Esri

Exploring data with statistics helps you think critically about your data and identify potential issues

  • Zero magnitude earthquakes represent missing values and will impact statistics calculated for the quakes.
  • Actual earthquake distributions have many more low magnitude quakes than large ones. This data set has been screened to show only the large quakes.

19

20 of 47

Charts

Source: Esri

Scatterplots are another great data exploration tool

This scatterplot shows the strong correlation between average temperature and latitude

Pro charts are linked to the map

  • Warm or cold outliers selected on the graph will also be selected on the map.

20

21 of 47

Field Types

21

22 of 47

Creating fields

Fields have specific types that can be created

Each field must be defined before placing any data in it

Once defined, the field type cannot be changed

Fields have stringent naming rules to ensure that they work with many different types of databases

  • No more than 13 characters.
  • Use only letters and numbers.
  • Must start with a letter.

22

23 of 47

Binary data

  • Computers store data as binary numbers in base 2
  • Base 2 numbers are sequences of 0’s and 1’s
  • Each binary digit is called a bit
  • Bits are grouped into sets of 8 to form a byte, which can store numbers from 0 to 255
  • More bytes are added to store larger numbers
  • A megabyte (M B) is 1 million bytes.

23

24 of 47

A S C I I storage

American Standard Code for Information Interchange (A S C I I) is how computers store text

Each letter, character, and symbol is represented by a unique 7-bit binary code

  • For example, the letter C is represented by 1000011 in binary (or 67 in decimal).
  • Upper and lower case letters have different A S C I I codes.

C A T = {67,65,84} decimal = 100001110000011010100

cat = {99,97,116} decimal = 110001111000011110100

148 = {49,52,56} decimal = 011000101101000111000

24

25 of 47

Storing data

  • Text data are always stored in A S C I I format.
  • Numeric data may be stored in A S C I I or binary format.
  • Binary is generally more efficient for numbers.

A S C I I stores “106” as three codes of 1 byte each = 3 bytes

106 = {49,48,54} decimal = 011000101100000110110

Binary stores “106” as a single 1-byte binary number

106 = 01101010

25

26 of 47

Byte storage limits

A single byte can store a value from 0 to

Larger numbers require more bytes

Signed numbers require a bit to store positive or negative, so storage limits are smaller

26

27 of 47

Integer versus float storage

Scientific notation

Binary stores whole numbers (integers)

To store decimal values, the computer stores a form of scientific notation with a mantissa and an exponent

  • 3.2957239e04 = 32957.239.
  • − 3.2957239e04 = − 32957.239.
  • 3.2957239e − 04 = 0.00032957239.

27

28 of 47

Float precision

Large numbers start to lose precision because the number of significant digits in the mantissa is limited.

  • 3.2957239e12 = 3295723900000.

A double-precision floating point allots more storage to the mantissa value

  • 3.295723956249723e12 = 3295723956249.723.

28

29 of 47

Database storage

Database fields are typically defined by

  • A S C I I versus binary type storage.
  • Bytes of storage allocated.
  • Integer versus floating point.

The definition limits the values that can be stored

  • It is important to match the type to storage requirements.
  • Try to minimize storage space while making sure all potential values will fit in the field.

Text (A S C I I) field with 10 bytes “Mississipp”

Binary 2-byte signed integer: −32,767 to +32,767

Single-precision floating point x.xxxxxxxeyy

29

30 of 47

Arc G I S field data types

Pro has six basic field types: short and long for integers, float and double for decimal values, text, and date

30

Field type

Explanation

Examples

Short

Integers stored as 2-byte binary numbers

Range of values −32,000 to +32,000

255

12001

Long

Integers stored as 10-byte binary numbers

Range of values −2.14 billion to +2.14 billion

156000

457890

Float

Floating-point values with eight significant digits in the mantissa

1.289385e12

1.5647894e − 02

Double

Double-precision floating-point values with 16 significant digits in the mantissa

1.12114118119141e13

Text

Alphanumeric strings

‘Maple St’

‘John H. Smith’

Date

Date/time format for calendar dates and times

07/12/2008

10/17/1963 13:24:06

31 of 47

Geometry fields

Source: Esri

  • Geodatabase attribute tables have special geometry fields called Shape_Length and Shape_Area
  • These fields are updated automatically if the feature shape changes, as during a generalize (a) or dissolve (b−c) operation
  • All user-defined geometry fields must be updated manually
  • Shapefiles have no automatic geometry fields, so all must be updated manually with the Calculate Attribute Geometry tool

31

32 of 47

Domains

32

33 of 47

Domains

Source: Esri

Domains are rules about what can be placed in a field

Coded domains provide lists of values to pick from

  • Choose Conifer or Deciduous instead of typing it each time.
  • A shorter code can be stored, saving space, while the user gets an understandable term to select.

Range domains specify the range of numeric values permitted

  • A percent field should have values from 0 to 100.

Domains are especially useful when editing or when collecting data in the field on a G P S or mobile device

33

34 of 47

Examples of domains

  • This table shows coded domains for collecting features that one might find in a park or on a campus
  • Coded domains save time and reduce typing and data entry errors

34

Domain name

Field type

Domain type

Coded values

Things

Text

Coded

Tree, Bench, Lightpole, Sign, Sculpture, Other

ConditionClass

Text

Coded

Good, Fair, Poor

LineCategories

Text

Coded

Sidewalk, Street, Trail, Powerline, Other

SurfaceMaterials

Text

Coded

Concrete, Asphalt, Gravel, Dirt, Grass, Matting, Bark, Sand, Other

LanduseClass

Text

Coded

Parking, Athletic field, Pool, Landscaping, Playground, Game court, General use, Other

35 of 47

Creating domains

Source: Esri

  • Domains are created for a geodatabase rather than a feature class, so they can be reused for many fields
  • Domains are given field types that must match the fields to which they are assigned
  • This SymbolType coded value domain is for a text field, and two codes for solid and dashed lines have been created
  • Split and merge policies control the behavior of the fields if they are edited

35

36 of 47

Assigning domains

Source: Esri

Once created, the domain must be assigned to the field using the Fields view

    • This ParcelType coded domain is being assigned to the Bldgtype field in this Buildings feature class.

The field type must match (Short, Float, and so on)

Domains can be reused for many fields if desired

    • True/False, or Yes/No, or, Good/Fair/Poor are examples of often-reused coded domains.

36

37 of 47

Schema

Source: Esri

The structure of a table, including its fields, their definitions, its domains, and so on, are called the table schema

    • This buildings feature class schema has four text fields of various widths, one domain, and two default values.

A schema can be saved without any data in it and used as a template to create another empty table with the identical structure

37

38 of 47

Tables in ArcGIS

38

39 of 47

Table views

Source: Esri

Tables are displayed and manipulated in a Table view

    • Rows are called records, and columns are called fields.
    • Right-clicking a field name opens a menu of actions.
    • The Table Options button provides additional functions.
    • The Status bar can be used to view selections.

Table views can be sized and docked like other views

39

40 of 47

Table view properties

Like layers, table views have properties that affect how the tables are viewed and displayed

The settings are cosmetic and do not affect the source table

Some common table view settings include

    • Aliases for field names to make them more understandable.
    • Formatting of fields for consistent decimals or significant figures.
    • Displaying only some of the fields.
    • Changing the field order.

Table view properties are set in the Fields view

40

41 of 47

Fields view

The Fields view is used to view and modify the properties of tables

    • In the default layer mode, shown as U S States, the changes are to the table view and are only cosmetic.
    • If you switch the Fields view to Data Source mode, the changes affect the source table and are permanent.

It is critical to be aware of what types of changes are being made to fields when using Fields view

    • In Data Source view, renaming this WHITE field is a permanent change.

Source: Esri

41

42 of 47

Types of changes in Fields view

Cosmetic edits affecting the table view properties

    • Displaying a field.
    • Setting read-only status.
    • Changing the display width.
    • Formatting numeric options.
    • Highlighting a field.
    • Changing an alias (in one table view).

Permanent edits affecting the source table

    • Adding/deleting a field.
    • Changing a field name.
    • Changing the defined storage width.
    • Creating domains.
    • Creating subtypes.
    • Changing an alias (in all maps, table views, and projects).

42

43 of 47

Editing fields

Source: Esri

  • To edit a field value, click in the value box and type a new value
  • Be careful not to change fields accidently
  • The List By Editing panel of the Contents page can be used to disable editing of a layer and avoid accidental changes

43

44 of 47

Calculating fields

Source: Esri

The Calculate Field tool can be used to enter an expression to calculate new values for a field in a table

    • Existing fields can be used in the expression.
    • Many numeric and string functions are also available.
    • Snippets of Python code can be used for more complex calculations.

44

45 of 47

Importing Excel data

Source: Esri

Excel worksheets can be read directly by Pro, provided that certain formatting requirements are met

    • The first row must contain legal field names.
    • Every column contains one type of data.
      • Don’t use n/a or x to indicate missing values in a number column.
    • There are no formulas, blank lines, or merged cells.
    • A dummy row may help the data type be determined correctly as text, integers, or floating point values.

NAME, NUM, STREET, CITY, STATE, ZIP, PERCENT

ddd, 999, 999, ddd, ddd, 99.99

Smith, 527, W. Main St., Peoria, Il, 45098, 38.3

A comma-delimited C S V file may work better in some cases

45

46 of 47

Importing text

Source: Microsoft, National Climatic Data Center

Text tables may come in several formats

a) tab-delimited file

b) comma-delimited file (C S V)

c) fixed-column file

Pro cannot read the fixed-column format, but Excel can be used to convert one to a C S V

46

47 of 47

47