
How much data is enough data?

Short answer: it depends on the application and how the data is going to be used.  However, in many ways it boils down to context…

For instance, in relation to climate change, if we look at global mean air temperature, it would be very difficult to assess any change over time if we did not have the data as a time series.

In the realm of climatic change, the temperature data that we have is rather limited, as it has only been recorded consistently around the globe since the turn of the 20th century.  Prior to more modern and standardized data records, we are forced to rely on anecdotal evidence and oral histories, as well as proxy data such as ice cores, tree rings and other physical phenomena that record some attributes of past climates in their composition and morphology.

Without some temporal context, we cannot validate any change over time, because we have no record of the past to compare against what we are seeing now.

Best practices and instrumentation guidelines

The climate change example should be a warning to anyone collecting data now.  Think of the resources saved, the lives improved and the power for positive change that we, as a society, would possess if we had a clear-cut meteorological record extending throughout time.  There would be much less guesswork about how the climate functioned with and without anthropogenic influences.

Sounds good, right?  So, along those lines, to make any data collection effort worthwhile, one should automatically ask these questions:

Will the data I collect be sensitive to changes that I am trying to observe?

It is important to consider WHY you are collecting data and WHAT the final use of that data will be.  Whether the data is collected for operational monitoring or to drive a modeling effort, you might ask yourself whether your data effort will sample frequently enough and long enough to detect:

  • Diurnal changes?
  • Seasonal changes?
  • Interannual variability?
  • Land use and land cover change?
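As a rough illustration of "frequently enough and long enough," the sketch below checks a sampling plan against a couple of cycles. It assumes the usual rule of thumb of at least two samples per cycle (the Nyquist criterion) and a record spanning at least two full cycles. The cycle table, the function name and the two-repeat threshold are all hypothetical choices for this example, and real studies usually need far more than these minimums.

```python
from datetime import timedelta

# Hypothetical cycles of interest; names and periods are illustrative only.
CYCLES = {
    "diurnal": timedelta(days=1),
    "seasonal": timedelta(days=365.25),
}


def resolvable_cycles(sample_interval, record_length, min_repeats=2):
    """Return the names of the cycles a sampling plan can plausibly resolve.

    A cycle counts as resolvable when there are at least two samples per
    period and the record spans at least `min_repeats` full periods.
    """
    found = []
    for name, period in CYCLES.items():
        fast_enough = sample_interval <= period / 2
        long_enough = record_length >= period * min_repeats
        if fast_enough and long_enough:
            found.append(name)
    return found


# Hourly samples for 90 days: enough for daily cycles, not seasonal ones.
print(resolvable_cycles(timedelta(hours=1), timedelta(days=90)))
```

With these assumptions, a logger that samples once a day for ten years would pick up the seasonal cycle but miss the diurnal one entirely.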

Or, if we use a more specific example, such as ice roads in Northern Alaska, we might ask more targeted questions:

  • What amount of instrumentation coverage do you need to have adequate spatial representation?
  • Should one consider adding more instrumentation for redundancy?
  • What if a station is damaged by equipment or animals?
  • Does my sampling frequency supply enough temporal resolution?
  • How would one know if pre-packing the road is effective or not?

How important is this data and what can I do to make sure my data acquisition efforts are not made in vain?

Equipment should be well suited to the environment where it needs to function. 

For example, check the intended use, operational temperatures, and measurement ranges on a given sensor.

  • Is it possible that you will exceed those limitations?
  • What objective hazards might your installation be subjected to and how can you mitigate the risk of losing data or losing equipment?
  • What different technologies are available to measure the same parameter (some are better than others)?
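One way to make the first of these checks routine is to compare the conditions you expect at the site against the limits on the sensor's datasheet, with some safety margin. The sketch below does this for operating temperature only. The `SensorSpec` class, the field names, the 5 °C margin and the example limits are hypothetical and not taken from any particular instrument.

```python
from dataclasses import dataclass


@dataclass
class SensorSpec:
    """Rated limits from a datasheet (the values used below are made up)."""
    name: str
    min_operating_c: float
    max_operating_c: float


def exceeds_limits(spec, expected_min_c, expected_max_c, margin_c=5.0):
    """Return True if expected site temperatures come within `margin_c`
    of the sensor's rated operating range, or fall outside it."""
    too_cold = expected_min_c < spec.min_operating_c + margin_c
    too_hot = expected_max_c > spec.max_operating_c - margin_c
    return too_cold or too_hot


# A sensor rated to -40 °C at a site that may reach -45 °C is flagged.
spec = SensorSpec("example probe", min_operating_c=-40.0, max_operating_c=60.0)
print(exceeds_limits(spec, expected_min_c=-45.0, expected_max_c=25.0))
```

The same comparison could be repeated for measurement range or any other rated limit that matters for the installation.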

Data collection is only as good as the installation.

  • In what ways have others approached your challenge, or similar ones, in the past?
  • Is your sensor of choice accurately interacting with the medium that you’re attempting to monitor?
  • Perhaps make a list of installation challenges and how you might manage each one.

Gather more data and employ redundancy in your collection system to make sure you don’t lose any of it.

Oftentimes, if you’re going to invest in gathering a few data points over a long period of time, you can collect much more data to achieve better context and reap the benefits with relatively little additional cost.  Further, comprehensive data collection means that you incorporate data telemetry as well as local data logging, so that a failure in one path does not cost you the record.
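To illustrate why logging locally alongside telemetry pays off, here is a minimal sketch that merges the two records after a site visit. It assumes each record is a plain mapping from timestamp to value and that the telemetered value is preferred wherever both exist. The function name and the sample readings are hypothetical.

```python
def merge_records(telemetry, local_log):
    """Combine two {timestamp: value} dicts into one gap-filled record.

    Telemetered values are kept where present; the local log fills in any
    timestamps that never arrived over the telemetry link.
    """
    merged = dict(local_log)
    merged.update(telemetry)  # telemetry wins where both sources have a value
    return dict(sorted(merged.items()))


# The 02:00 reading was lost in transmission but survives in the local log.
telemetry = {"2024-01-01T01:00": -12.4, "2024-01-01T03:00": -13.1}
local_log = {
    "2024-01-01T01:00": -12.4,
    "2024-01-01T02:00": -12.9,
    "2024-01-01T03:00": -13.1,
}
print(merge_records(telemetry, local_log))
```

Which source should win when the two disagree depends on the system, so treat that preference as a choice made for the example.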

In reality, you can never have too much data. You just need to collect the right data with the right equipment and organize it appropriately.
