In this article I have included a link to a Python script I wrote that can serve some document control and record-keeping purposes.
Let’s say that for record-keeping or administrative reasons, you have to download a zipped file from the web on a regular basis so that you always have the up-to-date version when you need it. You then have to unzip the file and put it into a folder named after the file. If that folder doesn’t exist, you must create it. If the folder already exists and already contains an unzipped file with the same name as the one you just unzipped, you want to replace the old file with the up-to-date one. What’s more, the names of the unzipped files could start with any letter from ‘a’ to ‘z’. For simplicity, let’s assume that in this case the unzipped file names start with either ‘f’ or ‘a’.
While this sounds like a simple task to do manually, imagine you have to download a large number of zipped files around the same time. Ten files might still be manageable, but what if it gets to 30 or 40? Doing all of that by hand would be a real headache, and it’s not very efficient.
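To make the steps above concrete, here is a minimal sketch of handling a single zipped file. It is not the linked script itself: the function name `extract_to_named_folder` and its parameters are hypothetical, and it assumes the folder should be named after the zip file without its `.zip` extension.

```python
import os
import zipfile


def extract_to_named_folder(zip_path, dest_root):
    """Unzip zip_path into dest_root/<zip name without extension>.

    Hypothetical helper for illustration; assumes the target folder is
    named after the zip file itself.
    """
    folder_name = os.path.splitext(os.path.basename(zip_path))[0]
    target_dir = os.path.join(dest_root, folder_name)

    # Create the folder if it does not exist yet; do nothing if it does.
    os.makedirs(target_dir, exist_ok=True)

    # extractall() overwrites files that already exist under the same name,
    # so an older unzipped copy is replaced by the freshly downloaded one.
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)

    return target_dir
```

Calling `extract_to_named_folder("downloads/form.zip", "downloads")` would, under these assumptions, create `downloads/form` if needed and place the unzipped contents there.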
Fortunately, once you’ve downloaded all the zipped files, there’s a Python script you can run to do everything I mentioned above, and it relies on only four Python modules: os, glob, shutil and zipfile. The Python script can be accessed via: Document_control_script. As you can see from the script, line 6 contains the directory where the downloaded files are stored. If the folder named after the unzipped file does not exist yet, the script will automatically create it. You can also see from the comments that lines 11 to 30 deal with unzipped file names that start with ‘f’, whereas lines 33 to 51 deal with unzipped file names that start with ‘a’. If you have file names that start with other letters, you could copy lines 11 to 30 (or alternatively, lines 33 to 51), paste them underneath line 51, and then change the ‘f’ or ‘a’ to the letter you are interested in.
You could make the script more concise by using a for loop, a class or even a simple function to handle all the letters you want in one place. For instance, if you don’t know what the leading letters of the files will be, you could define a regular expression with a bracketed character class, such as [a-z], that covers any of the 26 letters. You could then use a for loop to go through the letters in the brackets, unzip each file whose leading letter matches, and put it into the appropriate folder.
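As a rough illustration of the loop idea, the sketch below goes through a set of leading letters and unzips every matching archive into its own folder. It is not the linked script: the function name `unzip_by_leading_letter` and its parameters are hypothetical, it uses glob patterns rather than a compiled regular expression, and it assumes each zip file shares its leading letter with the file it contains.

```python
import glob
import os
import string
import zipfile


def unzip_by_leading_letter(download_dir, letters=string.ascii_lowercase):
    """Unzip every '<letter>*.zip' in download_dir into a same-named folder.

    Hypothetical helper for illustration. 'letters' is any iterable of
    leading letters to handle, e.g. "af" for the two cases in the article.
    Returns the list of folders that received extracted files.
    """
    extracted = []
    for letter in letters:
        # glob.escape() protects any special characters in the directory path.
        pattern = os.path.join(glob.escape(download_dir), letter + "*.zip")
        for zip_path in sorted(glob.glob(pattern)):
            folder_name = os.path.splitext(os.path.basename(zip_path))[0]
            target_dir = os.path.join(download_dir, folder_name)

            # Create the folder only if it is missing.
            os.makedirs(target_dir, exist_ok=True)

            # Existing files with the same name are overwritten on extraction.
            with zipfile.ZipFile(zip_path) as archive:
                archive.extractall(target_dir)
            extracted.append(target_dir)
    return extracted
```

Passing `letters="af"` restricts the run to the two cases discussed above, while the default covers all 26 lowercase letters. Note that glob matching is case-sensitive on some operating systems and not on others, so file names starting with capital letters may need their own entries.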
Hope this short tutorial helps! 🙂