It’s old news that unstructured data is growing at a substantial clip compared to any other data type.
It’s what we all do now – create docs. Make a copy and change it. Scan files. Take pics. Video anything. Track everything. Then, make lots (and lots) of copies just in case we ever need that doc again.
No matter what your file system structure looks like, people will just tend to dump files where it’s convenient, make copies, new versions, email them to one another, make a new version from the emailed one, save it, tweak it, start afresh.
That’s not just happening within a single traditional local NAS system, but across your entire enterprise.
How many times do you think that you are storing a version of the same file? Backing it up? Remote replicating it? The answers to those questions frequently come as a shock, as does the cost of simply storing that file.
Take a PowerPoint customer presentation, for example.
The original presentation weighs in at about 10MB in size. It’s an excellent presentation, so your entire sales team adopts it. Every time they present it to a prospective customer, they take a copy and amend some details.
Every time, that copy is 95% to 98% identical to the original (and all other copies). Perhaps the customer name and logo are swapped. Some pricing may vary. The requirements slide might be different.
However, the result is a multitude of very, very similar presentations, all occupying 10MB or so of storage space.
The IT team backs them all up daily, and stores those backups. They make more copies to store offsite.
IDC’s 2020 estimate of the impact of these kinds of replication is 1:9. That’s one original to 9 replications.
That means your 10MB original presentation is effectively occupying 100MB+ in total storage.
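The arithmetic behind that figure can be sketched in a few lines. This is only an illustration: the 10MB size and the 1:9 ratio come from the example above, while the function and variable names are made up for the sketch.

```python
# Illustrative arithmetic only. The 10 MB size and 1:9 ratio come from the
# example in the text; the names below are hypothetical.
ORIGINAL_MB = 10
REPLICAS_PER_ORIGINAL = 9  # one original to nine replications


def total_footprint_mb(original_mb: float, replicas: int) -> float:
    """Storage consumed by one original plus all of its replicas."""
    return original_mb * (1 + replicas)


footprint = total_footprint_mb(ORIGINAL_MB, REPLICAS_PER_ORIGINAL)
print(f"{ORIGINAL_MB} MB original -> {footprint} MB stored in total")
```

Multiply that by every presentation, spreadsheet and scan in the business and the scale of the problem becomes clear.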
That Doesn’t Just Create a Storage Problem
That’s expensive, but far more impactful is the data management problem it creates.
Which file version are your people working from? How do you track that file’s evolution? How many versions have you ended up with? Is anyone using them any longer?
Do those files contain sensitive data that only a restricted group of users should access? If so, how do you manage that, and what happens if files get moved to a directory with more open permissions?
The stats show that 90% of unstructured data goes completely unused after a year. And yet, this data must have been important – often critically so – to your company at some point.
Beyond Data Storage to Data Value
People will behave as people do – they’ll create and save files anywhere and everywhere. You won’t stop that, and neither will we.
But we can empower you to tame your data… to stop the out-of-control replication and multitude of versions of files that give you such a bad data management headache.
CloudFS can globally de-duplicate so that across all of your sites, you aren’t storing the same file, or version thereof, over and over and over again. If you’re running that 1:9 ratio, there’s a storage savings of around 90%.
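To see why near-identical copies deduplicate so well, here is a minimal sketch of block-level de-duplication by content hash. It is a generic illustration, not a description of how CloudFS works internally: the `DedupStore` class, the 4096-byte block size and the file names are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut a byte string into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class DedupStore:
    """Toy store that keeps each unique block once, keyed by its SHA-256."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes
        self.files = {}   # file name -> ordered list of digests

    def put(self, name: str, data: bytes) -> None:
        digests = []
        for block in split_blocks(data):
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # stored only if new
            digests.append(digest)
        self.files[name] = digests

    def get(self, name: str) -> bytes:
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self.blocks.values())


# One "presentation" of 100 distinct blocks, plus nine copies that each
# change a single block (standing in for a swapped logo or price).
original = b"".join(bytes([i]) * BLOCK_SIZE for i in range(100))
store = DedupStore()
store.put("original.pptx", original)
logical_bytes = len(original)

for k in range(9):
    edited = bytearray(original)
    edited[k * BLOCK_SIZE:(k + 1) * BLOCK_SIZE] = bytes([200 + k]) * BLOCK_SIZE
    store.put(f"copy-{k}.pptx", bytes(edited))
    logical_bytes += len(edited)

savings = 1 - store.stored_bytes() / logical_bytes
print(f"logical: {logical_bytes} bytes, stored: {store.stored_bytes()} bytes")
print(f"savings: {savings:.0%}")
```

In this toy run, ten files occupy the space of 109 blocks instead of 1,000, a saving of roughly 89%. Real files rarely change this neatly, so treat the numbers as a picture of the principle rather than a benchmark.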
Since CloudFS is immutable, you don’t have to back your data up. And if you are using an object store that replicates three ways (think Amazon S3, Google Cloud Storage, Azure Blob Storage), then you don’t need to remote replicate it.
But we give you the option to “cloud mirror” in case you don’t trust them to keep your data available whenever you need it.
With data feeling more under control, let’s talk about herding those user cats.
Findability
Using the Data Services data management platform, you can search across CloudFS and any third-party SMB or NFS system. Data Services search allows you to find files and, in the case of a third-party system, see how many times you are storing them, how old they are and when they were last accessed.
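For a feel of what that kind of report involves, here is a rough sketch that walks a mounted share, groups files by content hash, and records age and last access. It is not the Data Services API: `scan_tree` and `duplicates` are hypothetical helpers, "age" is approximated by modification time, and last-access times are only as reliable as the file system's atime settings.

```python
import hashlib
import os
import time
from collections import defaultdict


def scan_tree(root: str) -> dict:
    """Group files under root by SHA-256 of their content.

    Each entry records the path, days since last modification (a stand-in
    for "how old") and days since last access.
    """
    groups = defaultdict(list)
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)  # stat first, so reading cannot alter atime
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            groups[digest].append({
                "path": path,
                "age_days": (now - info.st_mtime) / 86400,
                "idle_days": (now - info.st_atime) / 86400,
            })
    return groups


def duplicates(groups: dict) -> dict:
    """Keep only content that is stored more than once."""
    return {d: files for d, files in groups.items() if len(files) > 1}
```

Calling `duplicates(scan_tree("/mnt/share"))`, where `/mnt/share` is a placeholder path, would list every piece of content held more than once along with how long each copy has sat untouched.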
Visibility
Data Services’ audit function for CloudFS goes well beyond search, allowing the administrators to see access, copy and move actions and more.
Think about a use case involving GDPR: how painful would it be if you were caught copying PII out of EMEA to the US?
Data Services can alert the admin and create an audit trail of who viewed, copied, or moved the file.
Armed with this information, you can self-report and show that while a technical breach occurred, no harm was done and this data leakage didn’t actually breach the spirit of the law.
Depending on your location and industry, there are a number of use cases for a substantial audit log for security, regulatory, legal hold, and many others.
Early Warning
No data management solution would be complete without the ability to track file actions in real time and recognize anomalous behavior.
CloudFS is immune to ransomware, so the file system itself cannot be affected, but an early heads-up of ransomware activity can save valuable time otherwise spent identifying affected files and recovering using snapshots.
Watch this space – Data Services has some exciting new abilities coming very soon.
Shift the balance of power in the fight against ransomware.
Storage Stops at Space. Data Management Delivers Value.
When your focus is on data storage, it can't be on data value. Profitability in storage is linked to preserving data volumes, which is the very challenge every organization on the face of the planet is wrestling with.
With incredible continued growth in unstructured data, there can never be enough storage space – only tough decisions about which data to keep, and for how long.
Switching your focus from data storage to data management means widening your lens to protecting and driving value from an asset, rather than consuming a commodity and deciding which data makes the grade.
This in turn drives you to find ways to control unnecessary data growth; make it quick and easy to find whatever you and your team need; detect and respond to threats; observe how your people use your data, and use those observations to develop workflows that predict where data will be needed, and by whom; achieve regulatory compliance; and drive high-performance environments by empowering collaboration where it works.