Discovery and Assessment
Imagine landing in Australia, renting a car, and getting on the road without realizing that Australians drive on the opposite side of the road. Imagine setting out for a destination without knowing how to get there, or how far you must travel because you think in miles while distances are posted in kilometers. These are the challenges organizations face when attempting to move data. Most often, they know the destination of their data but not what data they have; they know the destination differs from the source but not how to make the data application-compliant in the new environment; and they know how much bandwidth they have but not how many other resources are required to complete the task within predefined time parameters. Even when we think we have the answers, we still check the GPS for traffic or road closures, glance at the gas gauge to ensure we have enough fuel, and consult the manual when driving a car for the first time. In a data mobility scenario, even if you think you know, it is best to perform a thorough assessment. Here is what you will need to know before initiating a data move:
How many files exist, their size and type, when they were last accessed, and whether they are governed by regulatory compliance requirements. Understanding the data helps drive appropriate decisions, such as whether a chain-of-custody report is needed, whether orphaned data should be moved, and how best to move active files. The speed with which data can be moved depends on file size; the larger the file, the fewer transactions are required to move a given amount of data. Time last accessed helps identify static data that may not require resyncing after the move, before the cutover.
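As an illustration, the inventory metrics above (file count, size, type, last access) can be gathered with a simple walk of a file share. This is a minimal sketch, not a specific tool's implementation; the staleness threshold and summary fields are assumptions chosen for illustration:

```python
import os
import time
from collections import Counter

def scan_share(root, stale_days=365):
    """Walk a file share and summarize count, size, type, and last access."""
    now = time.time()
    total_files = 0
    total_bytes = 0
    stale_files = 0
    by_extension = Counter()
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip files we cannot read
            total_files += 1
            total_bytes += st.st_size
            by_extension[os.path.splitext(name)[1].lower()] += 1
            # Last-access time flags static data that may not need
            # resyncing between the initial copy and the cutover.
            if (now - st.st_atime) / 86400 > stale_days:
                stale_files += 1
    return {
        "files": total_files,
        "bytes": total_bytes,
        "stale": stale_files,
        "by_extension": dict(by_extension),
    }
```

A real discovery tool would also capture ownership and compliance metadata, but even this level of summary is enough to start sizing a move.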
Knowing the type of data being stored and which applications generated the data may help weed out data that doesn’t represent any value. It is not uncommon for employees to store personal data on their work shares or there may be downloads of conference presentations, white papers, etc., that haven’t been accessed in years and that are available elsewhere if needed in the future. Knowing which applications own the data will dictate how data will be transformed and what updates will be made to the application for the environment to remain application compliant.
Network bandwidth is only one variable in how fast data may be moved. A 1 GbE link has the bandwidth to move 125 MB/s, or about 10.8 TB/day. That assumes 100% of the bandwidth is available and can be saturated. Even at this (unrealistic) rate, it would take over three months to move 1 PB. Several variables affect how fast data can be moved: available bandwidth and latency, number of threads, available compute resources on the source and target (IOPS will vary depending on file size; smaller files require a larger number of IOPS), storage media performance, and the data's rate of change.
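The arithmetic above can be sketched as a back-of-the-envelope estimator. Decimal units are assumed (1 Gb/s = 125 MB/s, 1 PB = 1,000 TB), and the factors listed above (latency, threads, file-size mix, media performance, rate of change) will push real-world throughput well below this ceiling:

```python
def transfer_days(data_tb, link_gbps=1.0, utilization=1.0):
    """Estimate days needed to move data_tb terabytes over a network link.

    Ignores latency, thread count, file-size mix, and storage media
    performance, all of which reduce effective throughput in practice.
    """
    mb_per_sec = link_gbps * 125 * utilization    # 1 Gb/s = 125 MB/s
    tb_per_day = mb_per_sec * 86_400 / 1_000_000  # seconds/day, MB per TB
    return data_tb / tb_per_day

# 1 PB over a fully saturated 1 GbE link:
print(round(transfer_days(1_000), 1))  # ≈ 92.6 days, i.e. over three months
```

Dropping `utilization` to a more realistic fraction, or adding a rate-of-change term for resync passes, quickly shows why bandwidth alone is a poor predictor of migration duration.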
In the age of data hoarding, organizations are in constant need of more storage. Numerous innovations address the need for greater efficiency and lower storage cost, such as deduplication and compression, lower-RPM high-density drives, tiering, and erasure coding to reduce redundancy overhead. Despite all these innovations, organizations continue to express concern over their data growth. Even if nothing will be deleted and everything will be moved to the new environment, organizations benefit from knowing at least what they have, who owns it, and whether it has been accessed or modified recently. Poorly managed data benefits from a framework that maps it by the most relevant variable; think of a pivot table.
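The pivot-table idea can be sketched as a simple grouping of per-file records by owner and last-access age. The field names and age buckets here are illustrative assumptions, not part of any particular product:

```python
from collections import defaultdict

def pivot_by_owner_and_age(records):
    """Summarize total bytes per (owner, age bucket), pivot-table style.

    Each record is a dict with illustrative fields:
    owner, bytes, days_since_access.
    """
    table = defaultdict(int)
    for rec in records:
        days = rec["days_since_access"]
        # Bucket thresholds (90 / 365 days) are arbitrary examples.
        bucket = "active" if days <= 90 else "warm" if days <= 365 else "cold"
        table[(rec["owner"], bucket)] += rec["bytes"]
    return dict(table)
```

Swapping the grouping keys (department, file type, compliance flag) yields different views of the same inventory, which is exactly what makes the pivot-table framing useful for move/delete/do-nothing decisions.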
Interlock Technology's data mobility methodology starts with discovery and assessment of the data and its current environment. All insights gained from the data scan are used to design a mobility plan. The plan articulates what data may be moved, how much data will be moved, how much time the move will take, and what aspects of the data will be transformed to ensure access and application consistency in the new environment. The methodology is designed to be a repeatable process: the assessment can be performed on a designated schedule, and based on the information gathered, decisions can be made to move data, delete data, or do nothing. These insights empower organizations to make business-oriented decisions that result in greater efficiency and control over data and its storage environment.