
How to bulk delete files responsibly


One badly constructed command that intends to delete files from a folder can cost you your entire hard disk. It's not a nice place to be in. But there are steps you can take to mitigate the risks.

Finding the right command that deletes the files is only half the trick. Such commands are dangerous enough that they need sensible safeguards built in, no matter how focused and aware you are when running them.

This article gathers a comprehensive list of strategies and checks that can help prevent disasters when bulk deleting files. The examples rely on the find command, but the principles can be carried over to other commands.

Backup backup backup

In IT matters, playing defense is always the best starting strategy. Nothing beats a working, recent backup. If you happen to accidentally burn all the bridges and other important infrastructure in your project, backups are like a reliable, centuries-old network of tunnels running beneath everything that can break.

The level of your cautiousness, mindfulness, and confidence is irrelevant!

Schedule regular backups of all your resources, and manually back up the relevant parts before making significant or sensitive changes.

Test backups after creating them! Yes, really! It won't do anyone any good to discover after some disaster that backup files are missing or empty.
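One way to test a backup is to list it and restore it somewhere harmless. The following is only a sketch, assuming a tar archive is an acceptable backup format for the folder in question; the folders and file names are placeholders created with mktemp, not part of any real setup.

```shell
#!/bin/sh
# Sketch: archive a folder, then verify the archive before trusting it.
# src and dest are throwaway placeholders standing in for real folders.
src=$(mktemp -d)        # the folder you are about to clean up
dest=$(mktemp -d)       # your backup location
echo "report" > "$src/a.pdf"
echo "chart"  > "$src/b.svg"

archive="$dest/backup.tar.gz"
tar -czf "$archive" -C "$src" .

# Check 1: the archive exists and is not empty.
[ -s "$archive" ] || { echo "archive missing or empty" >&2; exit 1; }

# Check 2: the archive is readable and holds as many files as the source.
# Directories end in "/" in the listing, so they are filtered out.
archived=$(tar -tzf "$archive" | grep -vc '/$')
original=$(find "$src" -type f | wc -l)
echo "files in source: $original, files in archive: $archived"

# Check 3: restore into a scratch folder and compare it with the source.
restore=$(mktemp -d)
tar -xzf "$archive" -C "$restore"
diff -r "$src" "$restore" && echo "backup verified"

# Clean up; ${var:?} aborts if a variable is unexpectedly empty.
rm -rf -- "${src:?}" "${dest:?}" "${restore:?}"
```

Only after all three checks pass is the backup something you can lean on.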

Find (list) first, delete second

Whether you're constructing the command to be used immediately, or plan on using it at a later date from a cheat-sheet in your project documentation, it's good to first list the files you plan on deleting so you can check the command does include only the files you intend to delete.

With the find command you can do both. First list the files using all the options and patterns you intend to use for deleting, and check that the result really is the set you want gone. Then add the -delete option at the end of the same command to delete exactly that list. (The command is long, so it is split across several lines; a \ at the very end of a line, with nothing after it, tells the shell the command continues on the next line.)

  find /path/to/some/folder \
    -maxdepth 1 \
    -mtime +3 -mtime -5 \
    -size +700M -size -900M \
    \( -name "*.pdf" -o -name "graf*.svg" \
       -o -name "graf*.png" -o -name "footer*.html" \) \
    -type f | head -30

The piped head will limit the list output to a certain number of items, just in case it's a long list.

When ready to delete we can use the same find command with the -delete option at the end:

  find /path/to/some/folder \
    -maxdepth 1 \
    -mtime +3 -mtime -5 \
    -size +700M -size -900M \
    \( -name "*.pdf" -o -name "graf*.svg" \
       -o -name "graf*.png" -o -name "footer*.html" \) \
    -type f -delete
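Retyping the command for the second step invites small differences between what was listed and what gets deleted. One possible way around that, sketched here with a hypothetical shell function named matches and made-up file names in a temporary folder, is to define the search once and only append the action:

```shell
#!/bin/sh
# Sketch: one definition of the search, used for both listing and deleting.
sandbox=$(mktemp -d)    # throwaway folder so the example is safe to run
touch "$sandbox/old.pdf" "$sandbox/graf1.svg" "$sandbox/notes.txt"

# "$@" is placed last, so any action passed in (such as -delete)
# always ends up at the end of the expression.
matches() {
    find "$sandbox" -maxdepth 1 \
        \( -name "*.pdf" -o -name "graf*.svg" \) \
        -type f "$@"
}

# Step 1: review what would be deleted.
matches | head -30
listed=$(matches | wc -l)
echo "files matched: $listed"

# Step 2: the very same search, now with -delete appended.
matches -delete
left=$(ls "$sandbox")
echo "left in folder: $left"

rm -rf -- "${sandbox:?}"
```

Here the listing shows old.pdf and graf1.svg, and only notes.txt remains after the second step.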

Limit the location where deletion is to happen

One of the most important things to include in your command is the starting folder, which tells find where to search:

  find /path/to/some/folder

It's rarely necessary to delete across the entire disk, so reflecting your true intentions in the command makes a lot of sense and prevents possible future disasters. Be as precise as you can be, and spell out the full path. Avoid the common find . because its meaning depends entirely on your current directory, and it's easy to lose track of that when you switch between folders often.
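If the folder comes from a variable or a script argument, a small guard can refuse suspicious values before find ever runs. This is only a sketch; check_target is a hypothetical helper name, and the rules it enforces (absolute path, not the root, an existing directory) are assumptions about what counts as a sensible target.

```shell
#!/bin/sh
# Sketch: refuse to search anywhere that is not an explicit, existing folder.
check_target() {
    case "$1" in
        /)  echo "refusing: root folder" >&2; return 1 ;;
        /*) ;;   # absolute path, carry on
        *)  echo "refusing: not an absolute path: '$1'" >&2; return 1 ;;
    esac
    [ -d "$1" ] || { echo "refusing: not a directory: $1" >&2; return 1; }
}

target=$(mktemp -d)     # placeholder for /path/to/some/folder

# Only the first candidate passes; ".", "/" and an empty value are refused.
for candidate in "$target" "." "/" ""; do
    if check_target "$candidate"; then
        find "$candidate" -maxdepth 1 -type f
        echo "searched: $candidate"
    fi
done

rmdir "$target"
```

An empty variable is the case worth guarding against most, since it is easy to produce with a typo in the variable name.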

Limit the depth to a single folder

When the find command executes in a folder it will include all the subdirectories as well. If your needs are tied to a specific folder and not its child folders it's good to limit the scope using the -maxdepth option:

  find /path/to/some/folder -maxdepth 1
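To see the difference, here is a small experiment in a temporary folder (the folder layout and file names are invented for the illustration):

```shell
#!/bin/sh
# Sketch: the same search with and without -maxdepth 1.
sandbox=$(mktemp -d)
mkdir "$sandbox/archive"
touch "$sandbox/top.log" "$sandbox/archive/nested.log"

# Without -maxdepth, find descends into archive/ and matches both files.
all=$(find "$sandbox" -type f -name "*.log" | wc -l)

# With -maxdepth 1, only files directly inside the folder are matched.
top=$(find "$sandbox" -maxdepth 1 -type f -name "*.log" | wc -l)

echo "without -maxdepth: $all file(s), with -maxdepth 1: $top file(s)"

rm -rf -- "${sandbox:?}"
```

The first search matches two files, the second only top.log.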

Limit the files' modified time frame

If you don't limit the location where the command is executed and you tell it to delete all files older than a certain amount of time, you'll still be able to wreak havoc on your system. So this option offers less protection on its own than some of the others, but it still helps define your targets more accurately and prevents the wrong files from being deleted.

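This example lists files modified between three and five days ago (leaving out other options here for brevity, don't leave them out IRL). Note that find counts age in whole 24-hour periods and drops the fraction, so +3 means at least four full days and -5 means fewer than five; in practice this pair matches files last modified between four and five days ago: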

  find -mtime +3 -mtime -5
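The rounding is easier to trust after seeing it. The sketch below assumes GNU coreutils, because it relies on touch -d accepting a relative date to fake the modification times; the file names are invented.

```shell
#!/bin/sh
# Sketch: which ages does "-mtime +3 -mtime -5" actually match?
sandbox=$(mktemp -d)

# Assumption: GNU touch, which understands relative dates.
touch -d "84 hours ago"  "$sandbox/3.5-days.log"   # 3 whole days old
touch -d "108 hours ago" "$sandbox/4.5-days.log"   # 4 whole days old
touch -d "132 hours ago" "$sandbox/5.5-days.log"   # 5 whole days old

# +3 needs more than 3 whole days, -5 needs fewer than 5,
# so only the file that is 4 whole days old qualifies.
found=$(find "$sandbox" -maxdepth 1 -type f -mtime +3 -mtime -5)
echo "matched: $(basename "$found")"

rm -rf -- "${sandbox:?}"
```

The 3.5-day-old file is skipped even though it is older than three days, because its age is truncated to 3 before the comparison.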

Limit the file size

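If you know the size of your target files, it doesn't hurt to limit by that property as well. This example lists files larger than 700 MB and smaller than 900 MB (use G for GB and k for kB; GNU find counts these units in powers of 1024 and rounds each file's size up to the next whole unit before comparing):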

  find -size +700M -size -900M
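The same kind of range can be tried out at a smaller scale. This sketch uses kilobyte-sized files created with dd so it runs quickly; the file names and sizes are made up, and the rounding behaviour shown at the end is that of GNU find.

```shell
#!/bin/sh
# Sketch: a size window, scaled down from megabytes to kilobytes.
sandbox=$(mktemp -d)

# Hypothetical helper: create a file of the given size in KiB.
make_file() {
    dd if=/dev/zero of="$sandbox/$1" bs=1024 count="$2" 2>/dev/null
}
make_file small.bin  600
make_file medium.bin 800
make_file large.bin  1000

# Only the 800k file is larger than 700k and smaller than 900k.
found=$(find "$sandbox" -maxdepth 1 -type f -size +700k -size -900k)
echo "matched: $(basename "$found")"

# Rounding gotcha: a 1-byte file already counts as 1k,
# so "-size -1k" matches empty files only.
printf 'x' > "$sandbox/tiny.bin"
tiny=$(find "$sandbox" -maxdepth 1 -type f -size -1k | wc -l)
echo "files under 1k: $tiny"

rm -rf -- "${sandbox:?}"
```

Because of that rounding, it is worth listing the matches before trusting a size limit close to a unit boundary.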

Limit the name pattern

This one is as huge as the location one. Specifying the name patterns you're targeting saves you from starting a non-discriminatory meltdown of system neural pathways. Here's how you can include more than one pattern with wildcards:

  find \( -iname "*.pdf" -o -iname "graf*.svg" -o -iname "graf*.png" -o -iname "footer*.html" \)

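The -iname option ignores case while -name respects it. The parentheses group the patterns into a single condition, so that whatever follows applies to all of them; because parentheses are special characters in the shell, they must be escaped with backslashes.

The -o (or -or in GNU find) joins the patterns with a logical OR, so a file matches if any one of the patterns fits. Without it, find joins conditions with AND, and a file would have to match all the patterns at the same time to be returned.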
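The grouping matters more than it may seem, because AND binds tighter than OR in find expressions. The following sketch, using invented file names in a temporary folder and -print as a harmless stand-in for an action, shows what changes when the parentheses are left out:

```shell
#!/bin/sh
# Sketch: OR-ed name patterns with and without grouping.
sandbox=$(mktemp -d)
touch "$sandbox/report.pdf" "$sandbox/graf1.svg" \
      "$sandbox/GRAF2.SVG"  "$sandbox/notes.txt"

# No parentheses: -print attaches only to the last pattern,
# so report.pdf matches but is never printed.
ungrouped=$(find "$sandbox" -maxdepth 1 \
    -name "*.pdf" -o -name "graf*.svg" -print | wc -l)

# Parentheses: -print applies to whichever pattern matched.
grouped=$(find "$sandbox" -maxdepth 1 \
    \( -name "*.pdf" -o -name "graf*.svg" \) -print | wc -l)

# -iname also picks up GRAF2.SVG.
anycase=$(find "$sandbox" -maxdepth 1 \
    \( -iname "*.pdf" -o -iname "graf*.svg" \) -print | wc -l)

echo "ungrouped: $ungrouped, grouped: $grouped, ignoring case: $anycase"

rm -rf -- "${sandbox:?}"
```

The three searches match one, two, and three files respectively. The same binding rule applies to -type f and -delete, which is why the patterns are wrapped in parentheses throughout this article.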

Limit the target type

Limiting the type of your target may not always affect the results much, but if you happen to have a case where you aim to delete some files and the name pattern matches an important directory you'll be sorry you omitted it.

The -type f option targets regular files only, and -type d targets folders (directories).
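A quick illustration of such a clash, with a made-up folder whose name happens to end in .pdf:

```shell
#!/bin/sh
# Sketch: a name pattern that matches a directory as well as a file.
sandbox=$(mktemp -d)
mkdir "$sandbox/invoices.pdf"        # a folder that merely looks like a PDF
touch "$sandbox/invoice-1.pdf"

# The name pattern alone matches both the folder and the file.
any=$(find "$sandbox" -maxdepth 1 -name "*.pdf" | wc -l)

# Adding -type f leaves the folder out.
files=$(find "$sandbox" -maxdepth 1 -name "*.pdf" -type f | wc -l)

echo "name only: $any match(es), with -type f: $files match(es)"

rm -rf -- "${sandbox:?}"
```

The first search returns two entries, the second only invoice-1.pdf.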

Mind the options order!


All this work matching the patterns and narrowing down the list to be deleted won't mean a thing if you get the order of the options wrong. find evaluates its expression from left to right, and -delete acts the moment it is reached. Always put -delete last; if it comes before some of the patterns and limits, those are simply not taken into account, and everything matching the options before -delete will be gone. Place -maxdepth right after the starting folder, ahead of the tests; GNU find prints a warning if it appears later.
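The effect of a misplaced action can be observed without deleting anything by letting -print stand in for -delete. A sketch with invented file names in a temporary folder:

```shell
#!/bin/sh
# Sketch: an action placed too early ignores the tests that follow it.
# -print is used instead of -delete so that nothing is harmed.
sandbox=$(mktemp -d)
touch "$sandbox/cache.tmp" "$sandbox/thesis.doc" "$sandbox/photo.jpg"

# Wrong order: the action runs for every file before -name is checked.
wrong=$(find "$sandbox" -maxdepth 1 -type f -print -name "*.tmp" | wc -l)

# Right order: the action runs only for files that passed every test.
right=$(find "$sandbox" -maxdepth 1 -type f -name "*.tmp" -print | wc -l)

echo "action first: $wrong file(s) affected, action last: $right file(s) affected"

rm -rf -- "${sandbox:?}"
```

With the action first, all three files are affected; with the action last, only cache.tmp is. Had this been -delete, thesis.doc and photo.jpg would have gone too.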
