Perform big data analysis using ArcGIS GeoAnalytics Server

ArcGIS GeoAnalytics Server is a big data processing and analysis capability of ArcGIS Enterprise. It provides a distributed computing framework that powers a collection of analysis tools for analyzing large volumes of data. Through aggregation, regression, detection, clustering, and so on, you can visualize, understand, and act upon your big data. GeoAnalytics Server allows you to gain insights that may otherwise be hidden in your data, such as patterns, trends, and anomalies.

GeoAnalytics tools are versatile across industries. The following examples illustrate how GeoAnalytics Server can be used with different goals in mind:

  • As a crime analyst, you can understand the location and time of crimes in your state, as well as the proximity of crimes to areas of interest such as events, police stations, and city centers. Related tools are Aggregate Points and Join Features.
  • As a manager at a state Department of Transportation, you can analyze decades of traffic and crash data to determine the interstates with the most incidents. You can also analyze when certain vehicles were speeding and braking, and correlate them with the locations of vehicular accidents. Related tools are Find Point Clusters and Reconstruct Tracks.
  • As an environmental scientist, you can identify times and locations of high ozone levels across the country in a dataset of millions of static sensor reads. Related tools are Detect Incidents and Create Space Time Cube.
  • As an electric utility engineer, you can determine how close lightning strikes were to your electrical lines and substations. Related tools are Create Buffers and Join Features.
  • As a water utility technician, you can sort through work orders for leaks, and join them to a dataset of soil types to determine if leaks have occurred in areas where there is particularly corrosive soil. Related tools are Create Space Time Cube and Find Hot Spots.
  • As a retail lead, you can experiment with realigning your trade areas based on demographics, past sales, or distance to and from a store. You can also see how store performance is similar and dissimilar across your portfolio. Related tools are Dissolve Boundaries and Find Similar Locations.
  • As a city GIS analyst, you can use ArcGIS GeoEvent Server to ingest GPS data on all city vehicles, like public works vehicles and snow plows. See where vehicles have travelled, areas that have less coverage, and instances where vehicles exceeded the speed limit. Related tools are Reconstruct Tracks, Aggregate Points, and Detect Incidents.

Access the GeoAnalytics Tools

The feature analysis tools from ArcGIS GeoAnalytics Server can be used in Map Viewer, in ArcGIS Pro, through the ArcGIS API for Python, and through the ArcGIS REST API. As a portal member, you can access the tools using the steps below.

For information on running the tools through the ArcGIS REST API, see the ArcGIS REST API documentation. To learn more about running the tools in ArcGIS Pro, see the ArcGIS Pro documentation.

Access the tools from Map Viewer

  1. Log in to the portal as a member with GeoAnalytics feature analysis privileges.
  2. Click Map to open Map Viewer.
  3. Click Analysis and choose GeoAnalytics Tools.

Note:

If you do not see the Analysis button or the GeoAnalytics Tools tab in Map Viewer, contact your portal administrator. Your portal may not be configured with ArcGIS GeoAnalytics Server, or you may not have privileges to run the tools. If you do not have the permissions required for the tools, they will not be visible.

Access the tools from the ArcGIS API for Python

The ArcGIS API for Python allows GIS analysts and data scientists to query, visualize, analyze, and transform their spatial data using the powerful GeoAnalytics Tools available in their organization. To learn more about the analysis capabilities of the API, see the documentation site.

The big data analysis tools can be accessed via the geoanalytics module.
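
As a rough illustration, a call through the geoanalytics module might look like the sketch below. The portal URL, credentials, item title, and output name are placeholders rather than values from this documentation, and the parameter choices are assumptions. Check the API reference for your version before relying on them.

```python
def aggregate_crimes_into_bins(portal_url, username, password):
    """Run Aggregate Points on a hypothetical 'Crime incidents' layer."""
    # Imported lazily so this sketch can be loaded without the arcgis package.
    from arcgis.gis import GIS
    from arcgis import geoanalytics

    gis = GIS(portal_url, username, password)

    # The tools are only available when the portal is configured with
    # ArcGIS GeoAnalytics Server.
    if not geoanalytics.is_supported(gis):
        raise RuntimeError("GeoAnalytics is not available on this portal")

    # 'Crime incidents' is a placeholder title for a point feature layer item.
    items = gis.content.search("title:Crime incidents", item_type="Feature Layer")
    if not items:
        raise LookupError("No matching feature layer item found")
    points = items[0].layers[0]

    # Aggregate the points into 1-kilometer square bins.
    return geoanalytics.summarize_data.aggregate_points(
        point_layer=points,
        bin_type="Square",
        bin_size=1,
        bin_size_unit="Kilometers",
        output_name="crime_counts_1km",  # placeholder output name
    )
```

The function is not called here because it needs a live portal and a member account with GeoAnalytics feature analysis privileges.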

Prepare your data for analysis

You can run the GeoAnalytics Tools on the following:

  • Feature layers (hosted, hosted feature layer views, and from feature services)
  • Feature collections
  • Big data file shares registered with ArcGIS GeoAnalytics Server

GeoAnalytics Tools output

The output from running GeoAnalytics Tools can be one of two options:

  • A hosted feature layer with data stored in ArcGIS Data Store registered with the portal's hosting server.
  • A dataset stored to a big data file share (a folder, cloud store, or HDFS location) that you have registered with your GeoAnalytics Server.

Tool overview

An overview of each of the tools can be found below. The analysis tools are arranged in categories. These categories are logical groupings and do not affect how you access or use the tools in any way.

Summarize data

These tools calculate total counts, lengths, areas, and basic descriptive statistics of features and their attributes within areas or near other features.

Aggregate Points

Using a layer of point features and either a layer of area features or a distance used to calculate bins, this tool determines which points fall within each area and calculates statistics for all the points within each area. You can optionally apply time slicing to this tool.

The following are example scenarios for using this tool:

  • Given point locations of crime incidents, count the number of crimes per county or other administrative district.
  • Find the highest and lowest monthly revenues for franchise locations using 100 km bins.
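
The binning idea behind the second scenario can be sketched in plain Python. This is a single-machine illustration of the concept, not how GeoAnalytics Server implements the tool; the function name, the planar coordinates in kilometers, and the sample franchise data are hypothetical.

```python
import math
from collections import defaultdict

def aggregate_points(points, bin_size):
    """Group (x, y, value) points into square bins and summarize each bin.

    Returns {(col, row): {"count": n, "min": lowest, "max": highest}}.
    """
    bins = defaultdict(list)
    for x, y, value in points:
        # floor() assigns each point to the bin whose lower-left corner
        # is at (col * bin_size, row * bin_size).
        key = (math.floor(x / bin_size), math.floor(y / bin_size))
        bins[key].append(value)
    return {
        key: {"count": len(vals), "min": min(vals), "max": max(vals)}
        for key, vals in bins.items()
    }

# Hypothetical franchise locations: (x_km, y_km, monthly_revenue)
franchises = [
    (12.0, 40.0, 5200),
    (88.5, 10.0, 7100),
    (130.0, 20.0, 4300),
    (-5.0, 250.0, 6100),
]
summary = aggregate_points(franchises, bin_size=100)
```

Here the first two franchises share bin (0, 0), so that bin reports a count of 2 with the lowest and highest revenue of the pair.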

Build Multi-Variable Grid

The Build Multi-Variable Grid tool generates a grid of square or hexagonal bins and calculates variables for each bin based on the proximity of one or more input layers.

The following are example scenarios for using this tool:

  • Given multiple layers of public transportation infrastructure, what part of the city is least accessible by public transportation?
  • Given layers of waterways, such as lakes and rivers, what is the name of the water body closest to each location in the United States?
  • Given a layer of household income, where in the United States is the variation of income in the surrounding 50 miles the greatest?

Describe Dataset

Describe Dataset outputs feature sample and extent layers, calculates summary statistics, and outlines input layer properties.

The following are example scenarios for using this tool:

  • Given a dataset of 2 billion features, create a sample layer of 1000 features to efficiently visualize and inspect features in a map. Inspect summary statistics for the full dataset by viewing the output summary statistics table.
  • Given a big data file share dataset composed of 40 individual CSV files, output an extent layer to represent the spatial dispersion of input features without drawing them all on a map. View the output JSON overview to see the spatial reference, geometry type, and record count.

Join Features

Using two inputs, each of which is a table or a layer of point, line, or area features, this tool joins features that exhibit a specified relationship. Spatial, temporal, and attribute relationships can be used to join features, and summary statistics can optionally be calculated.

The following are example scenarios for using this tool:

  • Given point locations of crime incidents with a time, join the crime data to itself, specifying a spatial relationship of crimes within one square kilometer that occurred within one hour of each other, to determine if there is a sequence of crimes that are close to each other in space and time.
  • Given a table of ZIP Codes with demographic information and area features representing residential buildings, join the demographic information to the residences so that each residence now has the information.
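
The first scenario combines a spatial and a temporal relationship. A minimal sketch of that combined test is shown below, assuming planar coordinates in kilometers and a simple distance threshold; the function name and the sample crimes are hypothetical, and the all-pairs comparison is only practical for small inputs.

```python
import math
from datetime import datetime, timedelta
from itertools import combinations

def near_join(features, max_distance, max_gap):
    """Return id pairs that are within max_distance and max_gap of each other.

    Each feature is (id, x, y, timestamp), with x and y in a planar
    coordinate system that uses the same units as max_distance.
    """
    pairs = []
    for a, b in combinations(features, 2):
        close_in_space = math.hypot(a[1] - b[1], a[2] - b[2]) <= max_distance
        close_in_time = abs(a[3] - b[3]) <= max_gap
        if close_in_space and close_in_time:
            pairs.append((a[0], b[0]))
    return pairs

# Hypothetical crime incidents: (id, x_km, y_km, time)
crimes = [
    ("A", 0.0, 0.0, datetime(2020, 1, 1, 22, 0)),
    ("B", 0.6, 0.0, datetime(2020, 1, 1, 22, 30)),
    ("C", 0.3, 0.2, datetime(2020, 1, 2, 9, 0)),
    ("D", 5.0, 5.0, datetime(2020, 1, 1, 22, 10)),
]
pairs = near_join(crimes, max_distance=1.0, max_gap=timedelta(hours=1))
```

Only crimes A and B are joined: C is close in space but hours later, and D is close in time but far away.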

Reconstruct Tracks

Using either a layer of point features or polygon features that are time-enabled, this tool determines which input features belong in a track, ordering the inputs sequentially in time. It then calculates statistics about all the input features within each track.

The following is an example scenario for using this tool:

  • Given point locations and time of hurricane measurements, calculate the mean wind speed and maximum wind pressure of the hurricane.
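
The steps described above (group inputs into tracks, order them in time, then summarize) can be sketched as follows. The field names track_id, time, and wind_kmh and the sample readings are hypothetical, and the sketch illustrates the concept rather than the tool's implementation.

```python
from collections import defaultdict
from statistics import mean

def reconstruct_tracks(observations, value_field):
    """Group observations by track id, order them in time, and summarize."""
    grouped = defaultdict(list)
    for obs in observations:
        grouped[obs["track_id"]].append(obs)

    tracks = {}
    for track_id, rows in grouped.items():
        # Order the inputs sequentially in time before building the path.
        rows.sort(key=lambda r: r["time"])
        values = [r[value_field] for r in rows]
        tracks[track_id] = {
            "path": [(r["x"], r["y"]) for r in rows],
            "count": len(rows),
            "mean": mean(values),
            "max": max(values),
        }
    return tracks

# Hypothetical hurricane measurements; time is hours since the first reading.
readings = [
    {"track_id": "Ana", "time": 2, "x": -70.0, "y": 26.0, "wind_kmh": 150},
    {"track_id": "Ana", "time": 0, "x": -68.0, "y": 25.0, "wind_kmh": 120},
    {"track_id": "Bob", "time": 1, "x": -80.0, "y": 20.0, "wind_kmh": 95},
    {"track_id": "Ana", "time": 1, "x": -69.0, "y": 25.5, "wind_kmh": 180},
]
tracks = reconstruct_tracks(readings, "wind_kmh")
```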

Summarize Attributes

Using either feature or tabular data, this tool summarizes statistics for fields.

The following are example scenarios for using this tool:

  • Given locations of grocery stores with a field called COMPANY_NAME, summarize the stores by the company name to determine statistics for each company.
  • Given a table of grocery stores with fields called COMPANY_NAME and COUNTY, summarize the stores by the company name and county to determine statistics for each company in each county.
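
Both scenarios amount to grouping rows by one or more fields and computing statistics per group. A minimal sketch follows; COMPANY_NAME and COUNTY come from the scenarios above, while the SALES field, the company names, and the function name are hypothetical.

```python
from collections import defaultdict

def summarize_attributes(rows, group_fields, value_field):
    """Summarize value_field for each unique combination of group_fields."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in group_fields)
        groups[key].append(row[value_field])
    return {
        key: {"count": len(v), "sum": sum(v), "mean": sum(v) / len(v)}
        for key, v in groups.items()
    }

stores = [
    {"COMPANY_NAME": "FreshCo", "COUNTY": "Kent", "SALES": 100},
    {"COMPANY_NAME": "FreshCo", "COUNTY": "Kent", "SALES": 300},
    {"COMPANY_NAME": "FreshCo", "COUNTY": "Essex", "SALES": 200},
    {"COMPANY_NAME": "MarketHub", "COUNTY": "Kent", "SALES": 50},
]
by_company = summarize_attributes(stores, ["COMPANY_NAME"], "SALES")
by_company_county = summarize_attributes(stores, ["COMPANY_NAME", "COUNTY"], "SALES")
```

Adding COUNTY to the grouping fields splits each company's statistics by county, which mirrors the difference between the two scenarios.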

Summarize Within

This tool finds areas (and portions of areas) that overlap between two layers and calculates statistics for the overlap.

The following are example scenarios for using this tool:

  • Given a layer of watershed areas and a layer of land-use areas by land-use type, calculate total acreage of land-use type for each watershed.
  • Given a layer of parcels in a county and a layer of city boundaries, summarize the average value of vacant parcels within each city.

Find locations

These tools find features that pass any number of criteria that you specify. They are typically used for site selection, where the objective is to find places that satisfy multiple criteria.

Detect Incidents

This tool works with a time-enabled layer of points, lines, areas, or tables that represents an instant in time. Using sequentially ordered features, called tracks, this tool determines which features are incidents of interest. Incidents are determined by conditions that you specify.

The following are example scenarios for using this tool:

  • You're given a layer of GPS measurements of hurricanes every 10 minutes. Each GPS measurement records the hurricane's name, location, time of recording, and wind speed. Using these fields, create an incident where any measurement with a wind speed greater than 208 km/h is an incident titled Catastrophic.
  • Given a layer of sensor measurements, create an incident whenever values exceed the mean of the three previous values.
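
The condition in the second scenario can be written as a short function over one time-ordered track. This is a sketch of the condition only, with a hypothetical function name and sample readings; in the tool, you specify the condition yourself.

```python
def detect_incidents(values, window=3):
    """Flag each value that exceeds the mean of the previous `window` values.

    `values` must already be ordered in time (one track). The first
    `window` values are never flagged because they lack enough history.
    """
    flags = []
    for i, value in enumerate(values):
        if i < window:
            flags.append(False)
            continue
        previous = values[i - window:i]
        flags.append(value > sum(previous) / window)
    return flags

# Hypothetical sensor readings from a single track, oldest first.
readings = [10, 12, 11, 15, 12, 9, 20]
flags = detect_incidents(readings)
```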

Find Similar Locations

Based on criteria you specify, the Find Similar Locations tool measures the similarity of locations in your candidate search layer to one or more reference locations.

The following are example scenarios for using this tool:

  • Find the 10 most similar stores by examining the number of employees and the annual sales.
  • Find the 100 most similar cities by examining the relationship between population, annual growth, and tax revenue.
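
One common way to measure attribute similarity is to standardize each attribute and sum the squared differences from the reference location. The sketch below takes that approach as an assumption; this page does not state which method the tool uses, and the function name and store data are hypothetical.

```python
from statistics import mean, pstdev

def rank_similar(reference, candidates, fields):
    """Rank candidates by similarity to the reference (most similar first).

    Each field is standardized to a z-score across the reference and all
    candidates, then the squared differences from the reference are summed.
    """
    everyone = [reference] + candidates
    stats = {}
    for f in fields:
        vals = [row[f] for row in everyone]
        stats[f] = (mean(vals), pstdev(vals) or 1.0)  # avoid division by zero

    def z(row, f):
        mu, sigma = stats[f]
        return (row[f] - mu) / sigma

    def score(row):
        return sum((z(row, f) - z(reference, f)) ** 2 for f in fields)

    return sorted(candidates, key=score)

reference_store = {"name": "Flagship", "employees": 40, "sales": 2_000_000}
stores = [
    {"name": "North", "employees": 42, "sales": 2_100_000},
    {"name": "South", "employees": 10, "sales": 400_000},
    {"name": "East", "employees": 35, "sales": 1_500_000},
]
ranked = [s["name"] for s in rank_similar(reference_store, stores, ["employees", "sales"])]
```

Standardizing first keeps a large-valued attribute such as annual sales from dominating a small-valued one such as number of employees.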

Geocode Locations from Table

This tool converts addresses into coordinates. Use this tool on big data file share tables.

The following are example scenarios for using this tool:

  • Geocode multiple CSV files representing addresses of crime locations to explore hot spots of crime.
  • Geocode a text file representing delivery locations for an online retailer to determine where marketing efforts have been most effective.

Data enrichment

These tools help you explore the characteristics of your data. Add information to your input data by enriching it from another data source.

Enrich From Multi-Variable Grid

This tool joins attributes from a multivariable grid to a point layer, allowing you to quickly add a large and diverse collection of information to point data for use in further spatial analysis.

The following is an example scenario for using this tool:

  • Given a layer containing millions of power outage incidents, enrich the incident features with information about typical usage, environmental risks, and infrastructure conditions to study the relationship between these factors and power outage frequency.

Analyze patterns

These tools help you identify, quantify, and visualize spatial patterns in your data.

Calculate Density

The Calculate Density tool creates a density map from point features by spreading known quantities of some phenomenon (represented as attributes of the points) across the map. The result is a layer of areas representing the density.

The following are example scenarios for using this tool:

  • Calculate densities of hospitals within a county. The result layer will show areas with high and low accessibility to hospitals, and this information can be used to decide where new hospitals should be built.
  • Identify areas that are at high risk of forest fires based on historical locations of forest fires.
  • Locate communities that are far from major highways in order to plan where new roads should be constructed.

Find Hot Spots

The Find Hot Spots tool determines if there is any statistically significant clustering in the spatial pattern of your data.

The following are example questions this tool can help you answer:

  • Are your points (crime incidents, trees, traffic accidents) clustered? How can you be sure?
  • Have you discovered a statistically significant hot spot (for spending, infant mortality, consistently high test scores), or would your map tell a different story if you changed the way it was symbolized?

Find Point Clusters

The Find Point Clusters tool finds clusters of point features within surrounding noise based on their spatial distribution.

The following are example scenarios for using this tool:

  • Find clusters of pest-infested households to help target eradication efforts.
  • Using geolocated tweets following natural hazards or terror attacks, inform and act on rescue and evacuation needs based on the size and location of the clusters.

Forest-Based Classification and Regression

The Forest-Based Classification and Regression tool models and generates predictions using an adaptation of Leo Breiman's random forest algorithm, which is a supervised machine learning method.

The following are example scenarios for using this tool:

  • Given data on the occurrence of seagrass, as well as a number of environmental explanatory variables that have been enriched using a multivariable grid to calculate distances to factories upstream and in major ports, use this tool to predict future seagrass occurrence based on projections for those environmental explanatory variables.
  • Crop yield data has been collected from hundreds of farms across the country along with other attributes at each of those farms (number of employees, acreage, and so on). Using this data, provide a set of features representing farms that don't have crop yield (but do have all the other variables), and make a prediction about crop yield.
  • Housing values can be predicted based on the prices of houses that have been sold in the current year. The sale price of homes sold along with information about the number of bedrooms, distance to schools, proximity to major highways, average income, and crime counts can be used to predict sale prices of similar homes.

Generalized Linear Regression

The Generalized Linear Regression tool generates predictions or models a dependent variable in terms of its relationship to a set of explanatory variables. This tool can be used to fit continuous (OLS), binary (logistic), and count (Poisson) models.

The following are example questions this tool can help you answer:

  • What demographic characteristics contribute to high rates of public transportation usage?
  • Is there a positive relationship between vandalism and burglary?
  • Which variables effectively predict 911 call volume? Given future projections, what is the expected demand for emergency response resources?
  • What variables affect low birth rates?
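
To make the continuous (OLS) case concrete, the sketch below fits a line with a single explanatory variable using the closed-form least squares solution and then predicts a new value. It covers only that simplest case; the logistic and Poisson models are not shown, and the function name and sample data are hypothetical.

```python
def fit_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: population (thousands) and monthly 911 call volume.
population = [10, 20, 30, 40]
calls = [120, 220, 320, 420]
intercept, slope = fit_ols(population, calls)

# Expected call volume for a projected population of 50 thousand.
predicted = intercept + slope * 50
```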

Geographically Weighted Regression

The Geographically Weighted Regression (GWR) tool applies a local form of linear regression that is used to model spatially varying relationships.

The following are example questions this tool can help you answer:

  • Is the relationship between educational attainment and income consistent across the study area?
  • What are the key variables that explain high forest fire frequency?
  • Where are the school districts where children are achieving high test scores? What characteristics seem to be associated? Where is each characteristic most important?

Create Space Time Cube

This tool summarizes a set of time-enabled points into a netCDF structure by aggregating them into space time bins.

The following are example scenarios for using this tool:

  • Aggregate all crimes in a city into 1-km bins by month.
  • Aggregate all 911 calls that occurred in a county over the last 50 years into 100-km bins with annual temporal bins.

Note:

Create Space Time Cube cannot be run through Map Viewer. To use Create Space Time Cube, run the tool through ArcGIS REST API or ArcGIS Pro.

Use proximity

These tools help you answer one of the most common questions posed in spatial analysis: What is near what?

Create Buffers

A buffer is an area that covers a given distance from a point, line, or polygon feature.

The following are example scenarios for using this tool:

  • Using linear river features, buffer each river by 50 times the width of the river to determine a proposed riparian boundary.
  • Given areas representing countries, buffer each country by 200 nautical miles to determine the maritime boundary.
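
Testing whether a location falls inside the buffer of a line is equivalent to checking its distance to that line. The sketch below does this for one straight segment, assuming planar coordinates in a single unit; the function names and sample values are hypothetical, and real features would also need geodesic handling and multi-segment lines.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest planar distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def in_buffer(p, a, b, distance):
    """Return True if p lies within `distance` of the segment from a to b."""
    return distance_to_segment(p, a, b) <= distance

# Hypothetical river segment and two candidate locations.
river_start, river_end = (0.0, 0.0), (10.0, 0.0)
beside_river = in_buffer((5.0, 3.0), river_start, river_end, distance=4.0)
past_the_end = in_buffer((13.0, 4.0), river_start, river_end, distance=4.0)
```

The second location is 5 units from the nearest end point of the segment, so it falls outside a 4-unit buffer even though its perpendicular offset is only 4.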

Manage data

These tools are used for the day-to-day management of geographic and tabular data.

Append Data

This tool appends point, line, area, or tabular datasets to an existing hosted feature layer of the same geometry type.

The following are example scenarios for using this tool:

  • Given multiple datasets that have been generated monthly, append the datasets to a hosted feature layer to combine the data into an annual report.
  • Given 10 datasets containing climate measurements from various sources, append the datasets to create a single layer of climate measurements. Correct for the schema differences for each source using field mapping.

Calculate Field

This tool calculates values for a new or existing field and creates a layer in your contents on ArcGIS Enterprise.

The following are example scenarios for using this tool:

  • Modify an existing field named total to be the sum of revenue from the total_2016, total_2017, and total_2018 fields.
  • Create a field to categorize hazard levels based on field values such as windspeed and pollutant.
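
Both scenarios apply an expression to every row and store the result in a field. In the sketch below, a Python callable stands in for the expression you would give the tool; the function name, the hazard field, the windspeed threshold of 60, and the sample rows are hypothetical.

```python
def calculate_field(rows, field_name, expression):
    """Return new rows with field_name set to expression(row)."""
    return [{**row, field_name: expression(row)} for row in rows]

rows = [
    {"total_2016": 10, "total_2017": 12, "total_2018": 15, "windspeed": 95},
    {"total_2016": 4, "total_2017": 6, "total_2018": 5, "windspeed": 20},
]

# Set the total field to the sum of the three yearly totals.
rows = calculate_field(
    rows, "total",
    lambda r: r["total_2016"] + r["total_2017"] + r["total_2018"],
)

# Create a field that categorizes hazard level from windspeed.
rows = calculate_field(
    rows, "hazard",
    lambda r: "High" if r["windspeed"] > 60 else "Low",
)
```

Like the tool, the sketch produces a new result rather than editing the input rows in place.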

Clip Layer

Clip Layer extracts a subset of input layer features from a specified area to create a layer containing the subset.

The following are example scenarios for using this tool:

  • Given a feature layer containing national earthquake occurrence data, use a California state boundary layer to extract only the earthquakes that overlay California.
  • Given a buffer layer that extends 50 feet from the highway, clip the forest features that would be affected by highway expansion.

Copy to Data Store

This tool copies an input feature layer or table to an ArcGIS Data Store and creates a layer in your Web GIS.

The following are example scenarios for using this tool:

  • Copy a collection of CSV files in a big data file share to the spatiotemporal data store for visualization.
  • Copy the features in the current map extent that are stored in the spatiotemporal data store to the relational data store.

Dissolve Boundaries

Dissolve Boundaries finds and merges area features that intersect spatially or share the same field value.

The following are example scenarios for using this tool:

  • Given a feature layer of study areas, combine all features that have the same soil type value to create a layer representing areas by soil type.
  • Given restricted areas and buffer zones, dissolve all features together to summarize which locations are closed to development.

Merge Layers

This tool combines two datasets to create a single output layer. Use merge attributes to determine the resulting schema.

The following are example scenarios for using this tool:

  • Given feature layers for England, Wales, and Scotland, merge the layers to create a single feature layer of Great Britain.
  • Two feature layers represent contiguous townships, each with different field names. Combine the layers using attribute rules to match fields, and output a single layer with the desired schema.

Overlay Layers

Overlay Layers combines two or more layers into one single layer.

The following are example questions this tool can help you answer:

  • What parcels are within the 100-year floodplain? (Within is another way of saying on top of.)
  • What land use is on top of what soil type?
  • What wells are within abandoned military bases?