How To: Python Application Structure

Anyone that ever wrote any larger python application probably stumbled upon the same problem/question. How to structure python application using multiple files and classes, and still be able to code fast with any modern IDE? How to avoid cyclic imports and help IDE to specify variable type?

Here is an example:
Let’s say we decide to split our currently very long file to separate files in subdirectories. We would probably end up with architecture similar to this:

  - main.py # main application script
  - /sources:
    - source.py # sound worker engine
    - ...
  - /common:
    - common.py # math function
    - ...

As a proper, clean code is written, each file and set of functions are packed into classes. For example, source.py could contain class SoundEngine(), and everything regarding sound engine would fit into that class. common.py could contain class MathFunctions() and class FourierOperations().

The Problem

Let’s say our FourierOperations() needs some setup (
like numberOfSamples), which is initalised at the class creation and must be kept local for this specific class instance. Inside class there would probably be something like self.numberOfSamples storing this value.
We create our class instance inside main.py:

# main.py
import common.common as common
import sources.source as source

class MainApp():
    def __init__(self):
        numberOfSamples = 100
        self.fourierOps = common.FourierOperations(numberOfSamples)  

Now, what if SoundEngine() from source.pyclass needs exactly this FourierOperations class instance (doesn’t really matter why for this example) to perform some magic in the background? Let’s create it and pass it as an argument.

# main.py, parent class
  numberOfSamples = 100
  self.fourierOps = common.FourierOperations(numberOfSamples)
  self.soundEngine = source.SoundEngine(self.fourierOps)

Good. Now we navigate to our SoundEngine() class, and just wish to start code.

# source.py
class SoundEngine():
    def __init__(self, fourierOperations):
        self.fourierOps = fourierOperations
        ...
        self.fourierOps.???

Here is the problem. Humans can barely remember what we eat yesterday for lunch – that is why we use IDEs as VS Code and their autocomplete functionalities to help us code faster with recommendations. We would expect something like this to from a recommendation engine:

In a simple designs, where all code is in a single file, or there is only one class instance created and argument is passed, modern IDEs can assume what attributes fourierOps holds. But, if there is a more complex app structure, IDE sometimes just can’t figure out that type of this argument is, and therefore doesn’t recommend anything. Which is bad, we don’t like to code that way – we want IDE to do that for us.
There is a way to help IDE and specify what type of variable this is, with a simple syntax. Note that import statement must also be added.

# source.py
import common.common as common

class SoundEngine():
    def __init__(self, fourierOperations: common.FourierOperations):
        self.fourierOps = fourierOperations

And voila, editor can now show us all its variables and methods.
Now, what if I wish to pass SoundEngine instance to some other class inside common.py, for example, a class that would log all SoundEngine instance attributes? Let’s keep the same concept as before, specify type so we will be able to write code as a modern developers do.

# common.py
import sources.source as source # added for soundEngine: source.SoundEngine

class FourierOperations():
    def __init__(self, numberOfSamples):
        self.numberOfSamples = numberOfSamples

class LogSoundEngineAttributes():
    def __init__(self, soundEngine: source.SoundEngine):
        self.soundEngine = soundEngine

Nice. Now, although IDEs might be fine with this solution, python will not:

Why not? Let’s debug it. Stepping through code gives us this code tree:

main.py:
  # import common.common as common # 1st time
  > common/__init__.py
  > common/common.py
    # import sources.source as source
    > sources/__init__.py
    > sources/source.py
      # import common.common as common # # 2nd time
      # Note: common.py already imported previously, but classes are not declared since we branch to importing sources before. Debugger just skips it. 
     # class SoundEngine(): 
     #     def __init__(self, fourierOperations: common.FourierOperations): # throws error above

… which makes sense. common.py is firstly imported in main, but it branch to importing source.py right away, and than branch back to importing common.py. Although python can handle cylic imports, there should be no init code using any relative import declarations (in this case common.FourierOperations() is not yet declared at the time SoundEngine() is initialised in __init__(). Python 3 handle cyclic imports, but it can not do magic – making up object declarations which are not already available. See more about cyclic imports.

I hope you get the idea of the problem. If we wish to have a working application and get the best out of IDE, here are few recommendations you should consider while creating application framework.

Conclusion

  • Split code to smaller parts, multiple files, multiple classes as above. This is a good practice and sometimes, at larger projects, this is the only way to go anyway.
  • Think twice where a block of code should be placed in first place. In the example above, LogSoundEngineAttributes should be placed in one of source files. So, a general rule should be:
    • In common folder/scripts, include only stuff that is really common, like loggers, basic utility functions, … Scripts inside common folder should not include any other project scripts (like sources.py).
    • In sources folders/scripts, you can include any script from
      common
      folder or its specific class. Also, you can include sourceOne.py to sourceTwo.py.
  • Avoid passing class instance as arguments if possible. Use data classes defined in scripts from common folder to pass data around the code.
  • Learn about the class inheritance. Many times there is actually no need to pass ahead this specific class instance.
  • Learn about cyclic imports and how modules are imported.
  • Simplify code by not using classes if not necessary. Logging is such example, where it can be set to log from multiple modules with only import statement and logger intialization inside it.

The best solution?

If you use latest Python 3.7.x, there is a nice feature already available. According to PEP 563, enabling from __future__ import annotations disable type checking at initialization.
Which means, you can cross-import modules and interpreter would be fine with it. Stick to the rules above, so your code is clean and readable as possible, but unharmingly help IDE with specifying variable types.

Great read: https://realpython.com/python-type-checking/

Leave a Reply