Just started with linux mint, need help with a command

skaarl@feddit.nl · edit-2 2 days ago

harsh3466@lemmy.ml · 7 hours ago

Okay, took me a while to write everything up. The script itself is pretty short, but it’s still much easier to do this in a script than to try to cram it into a one-line command.

I tested this by creating a top-level directory with three subdirectories inside it, each holding a different number of html files, and it worked perfectly. I’m going to break down exactly what’s going on in the script, but do note the two commands that have comments above them: the echo line and the commented-out mv line. I set this script up so you can test it before actually executing it on your files. In the breakdown of the script I’m going to ignore the testing command as if it were not in the script.

The script:

#! /bin/bash

# script to move all html files from a series of directories into a single directory, and rename them as they are moved by adding a numeric indicator
# to the beginning of the filename, while keeping files of the same folder grouped.

fileList="$(find ~/test -name '*.html' | sort)"

num=1

while IFS= read -r line; do

	pad="$(printf '%03d' $num)"

	#The below echo command will test the script to ensure it works. 
	#The output to the terminal will show the mv command for each file.
	#If the results are what you want, you can comment this line out to disable the command, or delete it entirely.

	echo "mv $line ~/done/$pad${line##*/}"

	#This commented out mv command will actually move and rename all of the files.
	#When you are certain based on the testing that the script will work as desired
	#uncomment this line to allow the command to run and move & rename the files.

	# mv "$line" ~/done/"$pad${line##*/}"
	((num++))

done<<<"$fileList"
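
If you’d rather watch the whole thing run end to end without any risk to your real files, here’s a rough sketch of the same loop pointed at a throwaway sandbox. The sandbox layout and the file names (dirA, dirB, index.html, about.html) are made up for illustration, and the -n on mv is my own addition so it refuses to overwrite anything.

```shell
#!/bin/bash
# End-to-end sketch of the same loop, run against a throwaway sandbox.
# The directory and file names below are hypothetical.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/test/dirA" "$sandbox/test/dirB" "$sandbox/done"
touch "$sandbox/test/dirA/index.html"
touch "$sandbox/test/dirB/about.html" "$sandbox/test/dirB/index.html"

fileList="$(find "$sandbox/test" -name '*.html' | sort)"
num=1

while IFS= read -r line; do
	pad="$(printf '%03d' "$num")"
	# -n refuses to overwrite an existing file in the destination.
	mv -n "$line" "$sandbox/done/$pad${line##*/}"
	((num++))
done <<<"$fileList"

# List the renamed files.
ls "$sandbox/done"
```

With that layout the done directory should end up holding 001index.html, 002about.html and 003index.html, so the two index.html files no longer collide.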

The breakdown of the script is in the reply comment. Lemmy wouldn’t let me post it as one comment.

harsh3466@lemmy.ml · 7 hours ago

The breakdown:

#! /bin/bash - This line (the shebang) heads a bash script and tells the system which interpreter to use for the script. In this case we’re using /bin/bash to execute the script.

fileList="$(find ~/path/to/dir/with/html/files -name '*.html' | sort)" - What this command is doing is creating a variable called fileList using command substitution. Command substitution encloses a command in "$()" to tell bash to execute the command(s) contained within the substitution group and save the output of the command(s) to the variable. In this case the commands are a find command piped into a sort command.

find ~/path/to/dir/with/html/files -name '*.html' | sort - So this is the command set that will execute and the output of this command will be saved to the variable fileList.

find - invokes the find tool for finding files

~/path/to/dir/with/html/files - This tells find where to start looking for files. You’ll want to change this to the top level directory containing all the subdirectories with the html files.

-name - tells find to match file names using the expression that follows

'*.html' - This is the expression find will use to match files. In this case it’s using globbing to find all files with a file extension of .html. The * glob means to match any number of characters (including no characters at all). So when you combine the glob with the file extension for *.html, you’re telling find to find any files that have any characters at all in the filename as long as that filename ends with .html

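| sort - The pipe sends the output of the find command, which in this case is a list of files with the full path of each file, to the sort command, which sorts the list alphanumerically. Because each line is a full path, sorting keeps files from the same folder next to each other. That sorted list is then saved to the variable fileList.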
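
To see what actually lands in fileList, here’s a small sketch you can run anywhere. The temp directory and the file names in it are made up for the example.

```shell
# Hypothetical throwaway tree so the example is safe to run anywhere.
top="$(mktemp -d)"
mkdir -p "$top/a" "$top/b"
touch "$top/b/two.html" "$top/a/one.html" "$top/a/notes.txt"

# Command substitution: run find | sort and keep the output in a variable.
fileList="$(find "$top" -name '*.html' | sort)"

# One full path per line, sorted; notes.txt is skipped by the *.html glob.
printf '%s\n' "$fileList"
```

It should print the path to a/one.html first and b/two.html second, which is the grouping by folder the script relies on.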

num=1 - Here we’re creating a variable called num with a value of 1. This is for adding sequential numbers to the files as they are moved from their source directory to the destination directory.

while IFS= read -r line; do - This script uses a while loop to process each item saved to the fileList variable. Within the while loop, the moving and renaming of the files will take place.

while - Invokes the while command. What while does is repeat all of the commands contained inside the loop as long as the given condition is true. In this case the condition is the read command itself: read succeeds as long as there is another line of fileList to take in, so the loop keeps going. When there are no more lines to process, read fails, the condition becomes false, and the loop terminates.

IFS= - This sets the Internal Field Separator to nothing, just for the read command that follows. The Internal Field Separator is the set of characters the shell uses to split text into separate items, and by default those are Space, Tab, and Newline. read already takes the input one line at a time, but with the default IFS it would also trim spaces and tabs from the beginning and end of each line. Emptying IFS turns that trimming off, so every file path comes through exactly as find printed it.

read -r - The read command does what it says and reads one line of the input given. In this case that input is our variable fileList (we’ll get to how we make the while loop read the variable below). The -r flag tells read not to treat the backslash as an escape character (if it finds one anywhere) and just keep it as a normal part of the input.

line - this is the variable that each line will be saved in as they are worked through the loop. You can call this whatever you want. I just used line since it’s working through lines of input.

; do - The semicolon terminates the while setup, and do opens the loop for the commands we want to run using the input saved in line.
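
If you’re curious what IFS= and -r actually buy you, here’s a quick sketch comparing a plain read against IFS= read -r. The sample line is invented, and I’m feeding it in through a heredoc just to keep the demo self-contained.

```shell
# A made-up line with leading spaces and a backslash in it.
sample='  my\file.html'

# Plain read: leading whitespace is trimmed and the backslash is eaten.
read line1 <<EOF
$sample
EOF

# IFS= read -r: the line comes through exactly as written.
IFS= read -r line2 <<EOF
$sample
EOF

printf '[%s]\n' "$line1" "$line2"
```

The first line prints as [myfile.html] and the second as [  my\file.html], so only the second form is safe for odd filenames.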

pad="$(printf '%03d' $num)" - This is another variable being created using command substitution. What the command in the substitution group does is take the num variable and pad it with zeroes to be a three digit number.

printf '%03d' $num - This is the command that runs inside the substitution set.

printf - calls the printf command, which is similar to echo in that it prints output to standard out (the terminal), but printf has more options for manipulating that output.

'%03d' $num - This is a format specifier followed by the argument it formats. The % indicates that what follows is a format specifier. The 0 is a flag that says to pad with zeroes instead of the default spaces. The 3 is the minimum width, in this case formatting the number to at least three digits, and the d indicates that what’s being formatted is an integer. The combined format specifier of %03d will then format the argument that follows it. In this case, the variable num.
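
Here’s a tiny sketch of the padding on a few sample numbers (the numbers are arbitrary):

```shell
# Zero-pad a few sample numbers to a minimum width of three digits.
for num in 1 42 123 1000; do
	pad="$(printf '%03d' "$num")"
	echo "$num -> $pad"
done
```

That gives 001, 042, 123 and 1000. The width is only a minimum, so if you have more than 999 files you’d want to bump the 3 up to 4 to keep the names sorting cleanly.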

mv "$line" /path/to/dest/"$pad${line##*/}" - This command actually moves and renames the file.

mv - Invokes the mv command to move some files

"$line" - This argument is the file to be moved. The variable line expands to the full file path of the html file currently working through the while loop. Wrap it in double quotes so a path containing spaces stays together as a single argument.

/path/to/dest/"$pad${line##*/}" - This argument is the destination and renaming of the file being moved. The path part is pretty self explanatory. Replace this with the path to your desired destination. The filename bit needs its own explanation.

"$pad${line##*/}" - This is the bit that renames the file. What this is doing is concatenating (joining) two different variables to create the new filename. $pad is the formatted num variable to result in a zero padded three digit number that will be added at the beginning of the new filename. ${line##*/} is the line variable modified using parameter expansion, which is indicated by the curly braces.

Inside the curly braces you have three parts. The first is the parameter, which in this case is the line variable. This is followed by the special characters to modify the expansion, and then following the modification characters is the pattern to be matched.

Without modification, the line variable will look something like: /full/path/to/file1.html.

For the purposes of this mv command, we don’t want the full file path in the destination argument. If the destination argument was /path/to/dest/$pad$line, the expanded result would be /path/to/dest/001/full/path/to/file1.html. That’s obviously no good.

What we want here is /path/to/dest/001file1.html.

To get that, we use the ## modification characters, which take the pattern that follows them, match it against the beginning of the parameter, and delete the longest stretch that matches.

After the ## special characters is the actual pattern to be matched, which is */. That pattern is a glob, the same kind of pattern used in the find command above, and not a regular expression. As mentioned above, the * means:

“Match any number of characters, including none at all”.

The final character of the pattern is the literal /. Both characters together (*/) mean:

“Match any run of characters that ends with a forward slash”.

When you combine the ## modification characters with the */ pattern like so: ##*/ it means:

“Starting from the beginning, find the longest run of characters that ends with a forward slash, which is everything up to and including the last forward slash, and delete it”.

When you put all of that together you have ${line##*/} which will take /full/path/to/file1.html and output file1.html.

Finally, when you combine the pad variable with it like so: "$pad${line##*/}" (enclosed in double quotes to ensure correct expansion), you get 001file1.html as the new name for the html file as it is moved to the new destination.
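
If you want to poke at the parameter expansion on its own, here’s a short sketch using the example path from above. The pattern is a shell glob, so */ is the whole pattern; the #, % and .*/ variants are only there for comparison.

```shell
line='/full/path/to/file1.html'
pad='001'

base="${line##*/}"     # longest match of */ removed from the front
short="${line#*/}"     # a single # removes the shortest match instead
dir="${line%/*}"       # % is the mirror image and trims from the end
dotted="${line##.*/}"  # as a glob, the . is a literal dot, so nothing matches

printf '%s\n' "$base" "$short" "$dir" "$dotted" "$pad$base"
```

That prints file1.html, full/path/to/file1.html, /full/path/to, the unchanged /full/path/to/file1.html, and finally 001file1.html.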

((num++)) - A nice easy way to increment a number variable in bash. It adds 1 to num after each file, so the next file gets the next number.

done<<<"$fileList" - Is three parts. done indicates the end of the while loop. <<< is a here-string, a bash feature that feeds a string (here the contents of a variable) into a command’s standard input. It’s a close relative of the heredoc (<<), which passes a multiline chunk of text written directly in the script. Finally, there’s the variable holding the chunk of text we want fed into the while loop, which is the fileList variable (double quoted to ensure it is passed through exactly as is, spaces and other nonstandard characters included).
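
One practical upside of feeding the loop with <<< (bash calls this a here-string) instead of piping the list into while: the loop runs in the current shell, so variables changed inside it are still there afterwards. Here’s a rough sketch of the difference, with a made-up three-line list and default bash settings assumed.

```shell
#!/bin/bash
# A made-up list standing in for the output of find | sort.
fileList='a.html
b.html
c.html'

# Here-string: the loop runs in the current shell, so count survives.
count=0
while IFS= read -r line; do
	count=$((count + 1))
done <<<"$fileList"
echo "here-string: $count"

# Pipe: bash runs the loop in a subshell, so the outer variable stays at 0.
piped=0
printf '%s\n' "$fileList" | while IFS= read -r line; do
	piped=$((piped + 1))
done
echo "pipe: $piped"
```

The first echo reports 3 and the second reports 0, which matters here because num has to keep counting across every pass of the loop.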

And that’s the script! Let me know how it works for you.