Debugging in computer programming is a step-by-step process of isolating problem areas and then either fixing them or coming up with a workaround. A very common approach among developers is debugging with print statements. Also known as the “poor man’s debugging technique”, it is quite useful, since you can print anything under the hood to take a closer look at the variables while you isolate the problematic line or section of code. It also works across all programming languages and does not require knowledge of any other tool beforehand. It just gets you started!
Programmers often run into bugs that originate from their assumptions about data or an algorithm. Sometimes bugs are obvious and get caught easily. In other cases, programmers end up spending much more time identifying the problem area in the code. Python has its fair share of debugging tools. IDEs such as PyCharm take it a step further by providing a very interactive and powerful graphical debugger (pydev) with breakpoints, thread-wise function stack traces, watch variables, expression evaluation and so on. In spite of such highly productive debugging tools & techniques being available, they are unusable under certain circumstances. One such example is debugging on a command-line-only staging or pre-production server.
This blog explores some techniques which might speed up the process of finding bugs in a staging or pre-production environment in the absence of more preferred debuggers. In some specific cases, these methods might even work quicker in general, helping you narrow down problematic code faster. We will be working through some code examples.
Disclaimer: Please bear in mind that the examples contain sample code whose only purpose is to demonstrate the use case. They will contain assumptions which you may be tempted to criticize.
We have been asked to get weather information from a public API. We come up with the source code below:
We run the code and get the exception below:
At line #17, we have messed up concatenating string & int objects, as you would have guessed. This is often the case when we are dealing with data from an external source such as a public API. It is evident from the stack trace that “woeid” in the dictionary location_response is an integer. Let’s say we wanted to investigate further and confirm that the data type returned here is an int.
You could write a print statement and print the type of the variable. This would involve more typing and the overhead of remembering to remove the line afterwards! Instead, let’s fire up pdb & see how we can quickly get this done!
On the command line, execute:
> python -m pdb get_weather_state.py
pdb will load the Python script and wait for further instructions. Type “c” (continue) and hit enter to continue code execution:
This time along with the exception stack trace, we can see a (Pdb) prompt! We can investigate the value of variables using the instruction “p” (stands for print) followed by the name of the variable.
In pdb, we printed the value as well as the type of the variable location_response[“woeid”]. This confirms that location_response[“woeid”] is indeed an integer and so we have to convert it into a string and then concatenate. You can now type “q” (quit) and exit the debugger. Program execution will stop at this point.
Below is the snippet of the correction & output of the corrected code.
> python get_weather_state.py
Heavy Rain
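As a rough illustration of the error and its fix, here is a minimal sketch that does not call the API at all. The dictionary contents, the woeid value and the placeholder URL prefix are assumptions made up for this example; only the name location_response and the “woeid” key come from the code discussed above.

```python
# Hypothetical stand-in for the parsed API response (no network call is made).
location_response = {"woeid": 44418}

# Placeholder prefix for illustration only, not the real API address.
URL_PREFIX = "https://example.invalid/api/"

try:
    # The original mistake: concatenating a str with an int.
    url = URL_PREFIX + "location/" + location_response["woeid"]
except TypeError as error:
    print("Concatenation failed:", error)

# The correction: convert the integer to a string before concatenating.
url = URL_PREFIX + "location/" + str(location_response["woeid"]) + "/"
print(url)
```

An f-string or str.format would work equally well here; str() is used because it mirrors the fix described above.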
Keeping things ready
As programmers, we often come across that moment when there is an issue in the production environment where a rogue input is causing a bug in the program. Consequently, we quickly start executing the same flow in a pre-staging / staging environment to try and reproduce the issue & trace the bug.
Let’s assume we modified the previous code and introduced some exception handling to account for some errors.
However, we have avoided printing the full stack trace in the catch-all exception (except Exception) block to save on the size of logs getting generated.
Now when this method is executed in the production environment, we would observe the text “Some unknown error occurred” in the logs. In this situation, we would typically try and reproduce this issue using the same input in our local or staging environment. We can try the below options:
- Introduce some print statements in relevant areas of the code and execute
- Fire up pdb, put a breakpoint and then start the debugging process
What if we could have saved some precious seconds if we had written the below code in the first place?
In this case, we have tried to do two things in the except Exception block:
- Obtain the gettrace attribute from the sys module and call it. If it returns a trace function, a debugger is running. However, pdb has an optimization: unless a breakpoint is set, it removes its trace function once execution continues, so gettrace() would return None.
- To account for debuggers (e.g. pdb) which optimize in the way mentioned above, we also check whether the pdb module is in use, by looking it up in sys.modules.
Hence, if the variable is_debug is True, we print the full stack trace; otherwise we just print the error message which we were printing earlier. We assume here that when the code runs in the production environment, a debugger won’t be running (hence only the log). In case it is, then there must be a human debugging it (hence the stack trace).
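The two checks described above could be sketched roughly as follows. The function names debugger_is_running and describe_error are hypothetical and chosen for this illustration; the original script may structure this differently.

```python
import sys
import traceback


def debugger_is_running():
    """Best-effort guess at whether a debugger is attached (a heuristic)."""
    # sys.gettrace may be missing on some Python implementations.
    gettrace = getattr(sys, "gettrace", None)
    if gettrace is not None and gettrace() is not None:
        # A trace function is installed, which usually means a debugger.
        return True
    # pdb may have removed its trace function, so also check for the module.
    return "pdb" in sys.modules


def describe_error(exc):
    """Return a full stack trace under a debugger, a short message otherwise."""
    if debugger_is_running():
        return "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    return "Some unknown error occurred"


try:
    [][0]  # deliberately raise an IndexError
except Exception as error:
    print(describe_error(error))
```

Note that the sys.modules check also returns True if any library has imported pdb for its own reasons, so this is a heuristic rather than a guarantee.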
Let’s run the code at the console (like we normally would):
Now let’s run it using pdb:
Voila! We now have more information which will help us investigate the issue further! We now know that in line #20, we are trying to use the variable location_response without checking whether the below line returned a correct response from the API:
location_response = requests.get(META_WEATHER_URL_PREFIX + "search/?query=" + place).json()
Hence the fix would be to check the value of the location_response variable & then proceed in the code.
Tip #2: Instead of calling this function repeatedly on every exception, we can initialize a variable, say debug_mode_on, at the module level and save the return value of the above function in it. Then, whenever we need to perform this check, we can use the variable instead of calling the function.
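A minimal sketch of this tip, assuming a detection helper along the lines described earlier (the helper name debugger_is_running is hypothetical, and it is repeated here only so the snippet is self-contained):

```python
import sys


def debugger_is_running():
    """Heuristic: a trace function is installed or pdb has been imported."""
    gettrace = getattr(sys, "gettrace", None)
    if gettrace is not None and gettrace() is not None:
        return True
    return "pdb" in sys.modules


# Evaluated once, when the module is imported.
debug_mode_on = debugger_is_running()


def report(message):
    """Print extra detail only when the cached flag says a debugger is present."""
    if debug_mode_on:
        print("[debug]", message)
    else:
        print("Some unknown error occurred")
```

One trade-off to keep in mind: the cached value reflects the state at import time, so a debugger attached later will not be noticed.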
Invoke the debugger when exceptions occur
In the previous section, we looked at conditionally writing code during exception handling. We separated code based on whether a debugger was running. A more refined scenario would be conditionally writing exception handling code based on the kind of environment our code is executing in. For example, when code runs in a staging environment, we might want a debugger to be spawned right at the time of exception. This is because code running in the staging environment may be closely monitored. For a pre-production environment, we might want some detailed tracing and in the production environment we may log concise error messages only. Let’s see how we can achieve this by modifying the code used in the previous section.
Here, we can see that in the except Exception block, we have checked which environment we are executing in using an environment variable (read through os.environ) named EXECUTION_ENVIRONMENT, and have segregated the code on the basis of its value.
Notice something new in the else block?
In the last line, we have invoked the Python debugger itself! The line “pdb.post_mortem(tb)” invokes the Python debugger with the current execution information (which contains information on the exception currently being handled).
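The branching described above might look something like the sketch below. The values "production" and "pre-production", the helper name handle_unknown_error, and the choice to treat an unset variable as production are all assumptions made for this illustration; only EXECUTION_ENVIRONMENT and pdb.post_mortem(tb) come from the text.

```python
import os
import pdb
import sys
import traceback


def handle_unknown_error():
    """Call this from inside an `except Exception:` block."""
    # Assumption: an unset variable counts as production, so a debugger
    # is never spawned by accident.
    environment = os.environ.get("EXECUTION_ENVIRONMENT", "production")
    exc_type, exc_value, tb = sys.exc_info()

    if environment == "production":
        # Concise log message only.
        print("Some unknown error occurred")
        return "logged"
    if environment == "pre-production":
        # Detailed trace, but no interactive debugger.
        traceback.print_exception(exc_type, exc_value, tb)
        return "traced"
    # Any other value (for example staging): print the trace, then open
    # the debugger at the frame where the exception was raised.
    traceback.print_exception(exc_type, exc_value, tb)
    pdb.post_mortem(tb)
    return "debugged"
```

The return values exist only to make the sketch easy to check; in real code the function would probably return nothing.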
Assuming we are in the staging environment, let’s see what happens when we execute the code:
See? We now have a debugger right when the exception occurred! This is the same exception which we had discussed previously. Let’s see how investigating the values of some variables will help us further:
We now know that when an unknown place is sent as the query parameter to the API, it returns an empty list. Hence, we should add a check after the API has returned its response. We can now fix this easily! Remember that for other environments the code in this segment won’t be executed, so the code will not invoke the debugger & wait.
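One way to add that check is sketched below. It assumes the search response is a list of dictionaries, each carrying a “woeid” key; the helper name first_woeid and the sample data are hypothetical and no API call is made.

```python
def first_woeid(location_response):
    """Return the woeid of the first match, or None if nothing matched."""
    if not location_response:
        # An unknown place yields an empty list, so there is nothing to index.
        return None
    return location_response[0]["woeid"]


# Hypothetical sample responses standing in for the parsed JSON.
print(first_woeid([]))
print(first_woeid([{"title": "London", "woeid": 44418}]))
```

The caller can then decide how to report a missing place instead of letting the lookup fail inside the catch-all handler.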
Every programmer has their own set of debugging techniques which they use for day-to-day debugging in the local environment. However, much like simulation and reality are different, programmers often do not have the same tools available in the staging / pre-production / production environments.
We started by looking at simple command-line based debugging using pdb. Next, we segregated the code beforehand by printing the stack trace only when a debugger is running. This way we ensured that in the production environment we log concisely, and otherwise print more detailed traces. Finally, we segregated exception handling code based on an environment variable which hints at which environment the code is running in. Here, we invoked the debugger right at the time the exception occurred and ensured this only gets executed in a staging / local kind of environment.
There can also be segregation of exception handling code based on many other factors such as whether the Python process is running in the background or foreground, which we have not covered but can be explored further.
This blog was an attempt to share some of my learning as a programmer trying to figure out ways to track down bugs in these “reserved” environments. I hope these techniques help you write smarter code and give you a consistent & faster way of reproducing & debugging issues across environments.