Running Shell Commands from Python

dylan hudson
3 min read · Nov 4, 2021

Using Python with your shell is a great combination. Complex conditionals and control flow are much simpler to write in Python, but Python can be slow when processing large amounts of data, and for many jobs standard shell utilities will likely be much faster. To take advantage of both strengths, we can write the overall control logic in Python and call shell commands when necessary, without ever leaving our script. Using Python also gives us an easy way to integrate other software and platforms, like AWS, MongoDB, REST endpoints, etc.

You will need two standard-library modules to work with the shell: os and subprocess. Technically, you can get away with just os, but getting in the habit of using subprocess will save you a lot of frustrating debugging in the future.

Moving Around the Filesystem

The basic commands you’ll need to start are:
os.getcwd() — get the current working directory of your script.
os.chdir("my_file_path/") — change the script’s working directory to the path passed in.
os.listdir("my_directory/") — get a list of the folders and files in the specified directory (really useful for loops and automation).
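Putting those three together, here’s a minimal sketch (the "data/" directory name is just a placeholder for whatever subdirectory you’re working in):
import os

start_dir = os.getcwd()        # remember where the script started
print("Currently in:", start_dir)
os.chdir("data/")              # move into a (placeholder) subdirectory
for entry in os.listdir("."):  # list everything in the new working directory
    print(entry)
os.chdir(start_dir)            # move back to where we started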

It’s important to stay organized when you’re working with multiple directories. I suggest storing your file paths in variables, so they act as a single source of truth; that makes debugging clearer and code changes faster. It’s also a good idea to be as explicit as possible with file paths, so you don’t accidentally end up in a subdirectory you didn’t intend to.
Another good habit to adopt is the ‘leave things the way you found them’ principle: any function that changes the working directory should change it back before it returns. In theory, always being explicit and deliberate accomplishes the same goal, but there’s no downside to having an automatic double check here.
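For example, here’s a minimal sketch of that habit (list_logs and its log_dir argument are hypothetical names; the try/finally guarantees the directory gets restored even if something inside fails):
import os

def list_logs(log_dir):
    original_dir = os.getcwd()   # remember where we were
    try:
        os.chdir(log_dir)
        return os.listdir(".")   # do the work in the new directory
    finally:
        os.chdir(original_dir)   # always change back, even on an exception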

Calling Commands with Subprocess

The best way to call shell commands is with the “list” syntax that the subprocess library accepts; because no shell is involved, it keeps escape-character mistakes to a minimum. Just create a Python list with one element for each argument, splitting wherever you would type a space on the shell. For example:
mv myfile.txt myfolder/ on the shell becomes ['mv', 'myfile.txt', 'myfolder/']
and we can pass that to subprocess.call(['mv', 'myfile.txt', 'myfolder/'])
This includes flags, e.g. subprocess.call(['sed', '-i', 's/.//g', 'myfile.txt']). Note that we don’t need any extra quotes around the sed expression (where you would normally add single quotes on the shell), because there’s no shell doing word splitting. (Extra note: this example assumes GNU sed; you may run into issues on macOS, which ships BSD sed.)
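Here’s a runnable sketch of the list syntax (the file and folder names are placeholders; subprocess.run() is the newer interface added in Python 3.5, and it accepts the same list as subprocess.call()):
import subprocess

# Equivalent to 'mv myfile.txt myfolder/' on the shell;
# check=True raises CalledProcessError if mv exits with a non-zero status
result = subprocess.run(['mv', 'myfile.txt', 'myfolder/'], check=True)
print(result.returncode)  # 0 here, since check=True would have raised otherwise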
When in doubt, or if a complex command is causing issues, try using shlex to generate the list from the exact string you would type at the shell:
import shlex
import subprocess

shell_string = "cp myfile.txt myfolder/"
command_list = shlex.split(shell_string)  # ['cp', 'myfile.txt', 'myfolder/']
subprocess.call(command_list)

I/O Buffers and Spawning Processes

Subprocess does support passing input and capturing output from stdin/stdout/stderr, as well as spawning new processes so your script can keep running concurrently. We won’t cover the use cases or syntax details in this article, but you can read the official subprocess docs, and hopefully I’ll have a post up soon about when and why you’d need those features.
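Still, here’s a quick, hedged taste of both (a minimal sketch; capture_output and text require Python 3.7+):
import subprocess

# Capture a command's stdout as text instead of letting it print to the terminal
result = subprocess.run(['ls', '-l'], capture_output=True, text=True)
print(result.stdout)

# Spawn a longer-running process without blocking the script
proc = subprocess.Popen(['sleep', '5'])
# ... do other work here while 'sleep' runs ...
proc.wait()  # block only when we actually need it to be finished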

P.S. — What about custom functions and aliases in bash?

I love custom bash commands and aliases as much as the next person, but executing them from Python can get gnarly, especially if you have to parse multiple command-line arguments and/or deal with different internal field separators. Aliases and shell functions also usually live in your interactive shell configuration, so they aren’t available to the non-interactive shell that subprocess spawns (if it spawns one at all). On top of that, relying on them makes your script less portable, so I’d advise keeping your shell calls as foundational and one-command-one-task as possible. Let Python do the work of flow control and abstraction.
