How to Analyze and Clean Up Your File Pile

After a glance at my office, you might question my authority regarding organizing or cleaning anything. A pile of office decorations has sat in a corner for about five years. Doodles from my kids and system architecture drawings from past training sessions adorn my whiteboard. Five notepads feature sketches and ideas for new Titan CMS features, notes from candidate interviews, and a record of time spent on recent project work. Countless scraps of additional content stuffed into my desk should probably hit the round file.

I’ve noticed that quite a few Titan CMS users organize their files the way I organize my office. Fortunately, Titan CMS has some helpful tools for analyzing the file environment and keeping it clean.

 

Why Clean Up?

Yes, I’m about to tell you to go clean your room. But it won’t be so bad, and you’ll thank me later. A clean file system improves author productivity and gives site visitors a more responsive experience. Cleanliness might even reduce infrastructure costs.

Files and folders in Titan CMS are similar to Content sites and Data sites. We affectionately refer to these file sites as File Piles, which are primarily a database construct. Every file and folder is a record in the database. This makes it possible to manage tags, security, workflow, version history, comments, searchability and the other properties common to all content in the Titan CMS Workstation.

As with any figurative pile of stuff, File Piles grow relentlessly. File Piles with large amounts of clutter -- that is, duplicate or unused files -- represent extraneous records in the database. They can lead to:

  • Poor tree performance. While files don’t appear in the tree, they are records that must be analyzed based on Workflow security, so that we can show the correct folders in the tree. These security calculations have a direct impact on the performance of the tree.
  • Diluted search results. Content searches return unused content and thus make locating the right file more time-consuming. This is true for both the Workstation and the Display.
  • Inconsistencies in following content standards. Duplicate, obsolete, and deprecated content remains available to authors and editors, who might use them by mistake.
  • Time-consuming content maintenance and troubleshooting. Content duplicated in various places takes more time to update. If everyone uses the same file, an update flows through the site automatically.

Titan CMS files, beyond being records in a database, are digital assets stored on a server. A modest collection of files in Titan requires a larger-than-expected storage footprint on the server, due to management of multiple versions of documents. Images, in particular, are worth watching, as Titan creates multiple copies of an image to support use of optimized, web-ready sizes instead of the bulky originals.

While storage may be cheap, it isn’t free. Cloud storage capacity may be virtually unlimited, but at what cost? Planning for your site file storage needs is an important aspect of maintaining your site, and keeping the file environment clean helps control infrastructure costs.

 

How Much Work Will This Be?

I get it. You have other things to do. But the longer you put it off, the worse it will get. Get organized so you can spend more time on the things that matter.

I recommend using Titan CMS tools and following this basic approach to decluttering your Titan File Piles:

  1. Prepare mentally to delete anything useless.
  2. Analyze your content and determine where files are used.
  3. Delete unused files and normalize file usage to allow for deduping.
  4. Adopt or reform standards for your use of images and documents.
  5. Put together a folder structure that best serves authors in locating files.

 

Just Throw That Away

Titan CMS has this handy feature you might have heard about. It’s called Delete.

Too many people are downright scared of deleting things. They’ve taken those scary confirmation warnings a little too seriously.

 

Whoa! What are you thinking? Are you sure you want to delete this? It’ll be gone forever and you’ll be the one to blame when the whole thing falls apart. You can’t say I didn’t warn you.

Click OK to Delete -- despite my fervent warning!

Click Cancel if you have half a brain and don’t want to lose your job.

 

Titan CMS is somewhat less hysterical about it.

 

 

 

In Titan CMS, Delete really means Recycle. Items sit in our Recycle Bin for a time before they disappear.

This reduces the emotion and tension lurking around the Delete button, so you can be completely rational about using it. Develop criteria for discerning what is useful and what is not. Be okay with letting go of the junk; with Titan CMS, you’re protected. Recycle confidently. If you need that junk later, simply move it back. Not scary at all.

Are we calm now? OK. Let’s proceed to the next step: analysis.

 

Analyzing “Where Used”

Titan File Piles come with a system-defined view that tells you about file usage throughout the entire system. For example, any given image can be useful within Freeform content, as a Teaser Image in Properties, or linked to a Data Record. The Where Used view gives insight into these common content areas along with User Management, Site Configuration, Smart Search, and Block Copies. The view also provides links to support editing at those locations.

 

 

 

The Where Used view is unique among views in Titan CMS, as it depends on a thorough indexing of your entire system. The index allows quick location of files. A background job in Titan CMS builds this index. You can schedule the job to run automatically, or run it manually when you clean up your File Pile.

It is important to update the index from time to time, to account for changed content. The Where Used view includes a convenient message indicating the age of the index and a link to Titan Admin, where you can manually rebuild the content index.

 

 

 

In addition to the default view, users can create their own Where Used views. These can include other columns of useful information. Say you’re reviewing image usage throughout the system; you might also want to see file size and image dimensions. These properties indicate whether images are properly optimized and follow design standards. Creation and last-update dates can indicate the relevance of documents.

Information found in the Where Used view is critical to analysis. Whether you view it in Titan CMS or download it to a spreadsheet, the default Where Used view will show two important things relevant to cleanup.

 

Unused Files

Unused files have no counts listed in the view, so they’re easy to spot.
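
If you download the view to a spreadsheet, a few lines of Python can pull out the rows with no usage counts. This is only a rough sketch: the column names and the sample rows below are invented for illustration, so swap in the headers from your own export.

```python
import csv
import io

# Hypothetical column names; adjust to match the headers in your own export.
USAGE_COLUMNS = ["Freeform", "Properties", "Data Records"]


def find_unused(rows, usage_columns=USAGE_COLUMNS):
    """Return the rows whose usage counts are all blank or zero."""
    unused = []
    for row in rows:
        counts = [(row.get(col) or "").strip() for col in usage_columns]
        if all(count in ("", "0") for count in counts):
            unused.append(row)
    return unused


# Made-up sample standing in for an exported Where Used spreadsheet.
SAMPLE_EXPORT = """Name,Freeform,Properties,Data Records
logo.png,3,1,
old-banner.jpg,,,
team-photo.jpg,0,0,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
for row in find_unused(rows):
    print(row["Name"])
```

With the sample data, only old-banner.jpg and team-photo.jpg are reported. Treat the result as a list of candidates to review, not a list of files to delete.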

 

  

 

Titan also has a special dashboard widget, Files Not Used. It becomes very helpful after you’ve built the Where Used index. The widget makes locating unused files even easier.

Of course, not every unused file is trash. Before removing such files, ask these questions:

  • How old are the files?
  • Who initially uploaded them?
  • What was their intended purpose?
  • Are they still useful?

 

Duplicate Files

File names that contain [1] or another bracketed number signal that another file with that name existed in the system at the time it was uploaded. In most cases, that means you have a duplicate. Use the keyword filter to show only items containing a square bracket and you’ll see all the possible duplicates.
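
The same bracket check works on an exported list of file names. The sketch below is my own illustration, not a Titan CMS feature: it assumes that stripping the bracketed number gives you the name of the original file, and the sample names are made up.

```python
import re
from collections import defaultdict

# Matches a bracketed number such as "[1]" or "[12]" anywhere in a name.
BRACKETED = re.compile(r"\[\d+\]")


def group_possible_duplicates(names):
    """Map each presumed original name to its bracketed variants."""
    groups = defaultdict(list)
    for name in names:
        if BRACKETED.search(name):
            # Assumption: removing the first bracketed number recovers the original name.
            base = BRACKETED.sub("", name, count=1)
            groups[base].append(name)
    return dict(groups)


# Made-up file names for illustration.
sample = ["logo.png", "logo[1].png", "logo[2].png", "report[1].pdf", "notes.txt"]
for base, variants in group_possible_duplicates(sample).items():
    print(base, "->", variants)
```

Grouping the variants under one base name makes it easier to compare each set side by side before deciding which copy survives.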

 

 

As before, take time to understand why the duplicates exist before deleting them. For example, Titan CMS allows users to crop, rotate and resize images. A presumed duplicate may actually be a new, edited version of a file someone forgot to rename.

Finding unused and duplicate files is, admittedly, a very basic analytical process. I suspect that as you peruse the Where Used view, the creative organizational juices will start flowing. You might already have ideas about new folders and better categories, and that’s excellent.

The next step is to delete and dedupe. Deduping is simply the process of updating content and removing the duplicates, so only the correct version of a file remains.

 

Purge Recycle Bin and Disk Cleanup

At this point in the process, your recycle bin should be chock-full of refuse. In Titan CMS, the content in the recycle bin is managed content. It still exists as database content, complete with tags, security, workflow, past versions, and properties, and as digital assets on a file server.

Permanent deletion is the next step. Titan CMS users can perform it by locating items in the recycle bin and clicking the Delete icon in the toolbar. This raises a slightly more ominous confirmation prompt, and rightly so.

As an alternative to manual deletion, Titan CMS has a background job, Database Cleanup. It has a configurable option for automatically purging recycled content, based on duration in the recycle bin. This automates the process of staying on top of regular cleanup.
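
To make the idea of a duration-based purge concrete, here is a minimal sketch of such a retention rule. It is not how the Database Cleanup job is implemented; the function name, the 30-day window, and the sample dates are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def eligible_for_purge(recycled, retention_days, now):
    """Return the names of items recycled more than retention_days before now."""
    cutoff = now - timedelta(days=retention_days)
    return [name for name, recycled_on in recycled if recycled_on < cutoff]


# Made-up recycle bin contents: (file name, date it was recycled).
recycle_bin = [
    ("old-banner.jpg", datetime(2017, 9, 1)),
    ("logo[1].png", datetime(2017, 10, 5)),
]

# Hypothetical 30-day retention window, evaluated on a fixed date.
print(eligible_for_purge(recycle_bin, retention_days=30, now=datetime(2017, 10, 10)))
```

A longer window gives authors more time to notice a mistake; a shorter one reclaims space sooner.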

After manual deletion or deletion by background job, a final step remains. After a database cleanup, the digital assets remain on the file server. This is our last line of defense against accidental deletes. At this point we could restore a backup of your database and recover everything.

The final step is another background job – DocMgmt File Sweeper. This job runs through all the file assets on your file server and deletes any file not referenced by the Titan CMS database. Depending on the magnitude of files to process, this can run long. Once it completes, you will have reclaimed your disk space.
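
Conceptually, a sweep like this compares what sits on disk with what the database still references. The sketch below illustrates that comparison only. It is not the DocMgmt File Sweeper itself: the folder layout and the set of referenced paths are invented, and it only lists orphaned files rather than deleting them.

```python
import tempfile
from pathlib import Path


def find_orphans(asset_root, referenced):
    """List files under asset_root whose relative path is not in referenced."""
    root = Path(asset_root)
    orphans = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root).as_posix()
        if path.is_file() and relative not in referenced:
            orphans.append(relative)
    return orphans


# Demo against a throwaway folder, so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images").mkdir()
    (root / "images" / "logo.png").write_text("still referenced")
    (root / "images" / "logo[1].png").write_text("no longer referenced")
    # Hypothetical set of paths the database still points to.
    print(find_orphans(root, referenced={"images/logo.png"}))
```

Reporting first and deleting later is a deliberate choice here: a dry run is a cheap way to sanity-check the list before any disk space is reclaimed.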

Before you call it a day, do two more things to further elevate your cleanup.

 

Standard Image Sizes

Titan CMS can automatically create web-ready sizes for images uploaded into the system. Many Titan CMS installations use the four standard sizes – Small, Medium, Large, and Thumbnail. You can easily add more sizes, should a site design call for them.

The example below is from our environment, and perfect for this topic. Notice that several of the named sizes have identical dimensions.

  • Original = BlogBannerT
  • Medium = BlogTeaserN
  • PortfolioTeaser = BlogTeaserT

 

 

 

First, this can confuse an author. Which image size fits which situation? What do these names mean? Second, and more to the topic at hand, Titan CMS is creating additional image files to support these named sizes, when a file with those dimensions already exists. Based on this example, I would assume that three redundant files exist on disk for every large-format image in the system.
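
If you jot down your named sizes and their dimensions, spotting the overlaps takes only a few lines. This sketch borrows a few size names from the example above, but the pixel dimensions are placeholders I made up, not our actual configuration.

```python
from collections import defaultdict


def find_redundant_sizes(sizes):
    """Group named sizes by (width, height) and keep groups with more than one name."""
    by_dimensions = defaultdict(list)
    for name, dimensions in sizes.items():
        by_dimensions[dimensions].append(name)
    return {dims: names for dims, names in by_dimensions.items() if len(names) > 1}


# Placeholder dimensions for illustration only.
named_sizes = {
    "Small": (150, 150),
    "Medium": (300, 200),
    "BlogTeaserN": (300, 200),
    "PortfolioTeaser": (400, 300),
    "BlogTeaserT": (400, 300),
}

for dimensions, names in find_redundant_sizes(named_sizes).items():
    print(dimensions, names)
```

Every group it reports is a candidate for consolidation into a single, clearly named size.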

Take some time to review the image configuration. Is there room for optimization? If you do change anything, run the Rebuild Image Sizes background job to create all the correct sizes across the File Piles in your environment. Also, run the DocMgmt File Sweeper task, mentioned previously, to clean up all the old images.

 

Whole-Enterprise vs Site-Centric Folders

The organization of your files into folders is critical, and a helpful folder structure will make a world of difference to authors and editors.

Everyone has an opinion about file categories, and everyone likes it their way. For years, we recommended separate File Piles for each website, to avoid intermingling Intranet files with those of the public-facing website. In situations requiring secured file content, this is still a good model, but certainly not the only option.

Recently, especially with the expansion of Data content, I have come to recommend considering file and folder organization from a whole-enterprise perspective. This faces the reality that authors and editors log into a single Titan CMS environment to manage content across all their web properties -- that is, their whole enterprise. Files are useful in any context, not just a single Content or Data site. Removal of unnecessary levels of folders makes it easier for authors to find things and to decide where to put things.

As a simple starting point, consider arranging your files into two new File Piles, one for Images and the other for non-image Resources. Then break each of these down into high-level categories representing their purpose or what they represent, such as Logos, Icons, People, or Banners. At the same time, avoid categories that are too granular or specific, unless they serve a specific audience. Make reducing unnecessary folder levels a goal. Once you achieve it, everyone will be clearer about what goes where, and the barriers that lead to duplicate content will be lower.

 

It Will Be Worth It

Cleaning up a File Pile is a tough job, but failure to do it drags down the whole organization. Contribute to the greater good. Make life easier on your authors and editors, and satisfy your own hunger for order and efficiency. Go clean your room.

Originally Published: Tue, October 10, 2017